Geniusrise is a modular, loosely-coupled AI-microservices framework.
It can be used to host inference endpoints, perform bulk inference, fine-tune models, and more, with open-source models or closed-source APIs.
The framework provides structure for modules and operationalizes and orchestrates them.
The modular ecosystem provides a layer of abstraction over the myriad of models, libraries, tools, parameters and optimizations underlying the operationalization of modern AI models.
Together the framework and ecosystem can be used for:
Rapid prototyping by hosting APIs on a wide range of models
Host and experiment locally, and iterate fast
Deploy to production on Kubernetes
Building AI-side components using the framework and CLI
Build complex AI microservices using multiple models
Iterate fast from development to production
Manage, scale and monitor deployments in production
Build once, run anywhere
Using the ecosystem as a library: many interesting applications can be built this way.
The Geniusrise framework is built around loosely-coupled modules acting as a cohesive adhesive between distinct, modular components, much like how one would piece together Lego blocks. This design approach not only promotes flexibility but also ensures that each module or "Lego block" remains sufficiently independent. Such independence is crucial for diverse teams, each with its own unique infrastructure and requirements, to seamlessly build and manage their respective components.
Geniusrise comes with a sizable set of plugins which implement various features and integrations. The independence and modularity of the design enable sharing of these building blocks in the community.
Task: At its core, a task represents a discrete unit of work within the Geniusrise framework. Think of it as a singular action or operation that the system needs to execute. A task manifests as either a Spout or a Bolt, as described below.
Components of a Task: Each task is equipped with the following components:
State Manager: This component continuously monitors and manages the task's state, ensuring that it progresses smoothly from initiation to completion, and reports errors and ships logs to a central location.
Data Manager: As the name suggests, the Data Manager oversees the input and output data associated with a task, ensuring data integrity and efficient data flow. It also enforces data sanity, partition semantics, and isolation.
Runner: These are wrappers for executing a task on various platforms. Depending on the platform, the runner ensures that the task is executed seamlessly.
Task Classification: Tasks within the Geniusrise framework can be broadly classified into two categories:
Spout: If a task's primary function is to ingest or bring in data, it's termed as a 'spout'.
Bolt: Tasks that don't primarily ingest data but perform other operations are termed 'bolts'.
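To make the spout/bolt split concrete, here is a deliberately toy Python sketch; the class and method names are illustrative assumptions, not the actual Geniusrise API:

```python
# Hypothetical sketch of the spout/bolt distinction. Class and method
# names here are illustrative only, not the real Geniusrise interfaces.

class Spout:
    """A task whose primary job is ingesting data from some source."""

    def run(self):
        # Pull data in and hand it to the output data manager.
        return ["record-1", "record-2"]


class Bolt:
    """A task that consumes data, processes it, and produces output."""

    def run(self, records):
        # Transform whatever an upstream spout produced.
        return [r.upper() for r in records]


ingested = Spout().run()
processed = Bolt().run(ingested)  # ["RECORD-1", "RECORD-2"]
```

The point of the split is composability: any bolt can sit downstream of any spout (or another bolt) as long as the data-manager contract between them holds.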
The beauty of the Geniusrise framework lies in its adaptability. Developers can script their workflow components once and have the freedom to deploy them across various platforms. To facilitate this, Geniusrise offers:
Runners for Task Execution: Geniusrise is equipped with a diverse set of runners, each tailored for different platforms, ensuring that tasks can be executed almost anywhere:
On your local machine for quick testing and development.
Within Docker containers for isolated, reproducible environments.
On Kubernetes clusters for scalable, cloud-native deployments.
Using Apache Airflow for complex workflow orchestration. (Coming Soon).
On AWS ECS for containerized application management. (Coming Soon).
With AWS Batch for efficient batch computing workloads. (Coming Soon).
With Docker Swarm clusters as an alternative orchestrator to Kubernetes. (Coming Soon).
This document delves into the core components and concepts that make up the Geniusrise framework.
Because of the very loose coupling of its components, the framework can be used to build complex networks of independently running nodes, but it provides only limited orchestration capability, such as synchronous pipelines. An external orchestrator like Airflow can be used in such cases to orchestrate Geniusrise components.
The Geniusrise framework is designed to provide a modular, scalable, and interoperable system for orchestrating machine learning workflows, particularly in the context of Large Language Models (LLMs). The architecture is built around the core concept of a Task, which represents a discrete unit of work. This document provides an overview of the architecture, detailing the primary components and their interactions.
A task is the fundamental unit of work in the Geniusrise framework. It represents a specific operation or computation and can run for an arbitrary amount of time, performing any amount of work.
State Managers play a pivotal role in maintaining the state of tasks. They ensure that the progress and status of tasks are tracked, especially in distributed environments. Geniusrise offers various types of State Managers:
DynamoDBStateManager: Interfaces with Amazon DynamoDB.
InMemoryStateManager: Maintains state within the application's memory.
PostgresStateManager: Interfaces with PostgreSQL databases.
RedisStateManager: Interfaces with Redis in-memory data structure store.
State Managers store data in various locations, allowing organizations to connect dashboards to these storage systems for real-time monitoring and analytics. This centralized storage and reporting mechanism ensures that stakeholders have a unified view of task states.
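As an intuition pump, a state manager boils down to a small key-value interface plus log collection. The following toy class is an illustrative assumption, not the real InMemoryStateManager:

```python
# Illustrative sketch only: a toy in-memory state manager. The real
# InMemoryStateManager's interface may differ.

class ToyInMemoryStateManager:
    def __init__(self):
        self._state = {}
        self._logs = []

    def set_state(self, task_id, status):
        # Track a task's progress from initiation to completion.
        self._state[task_id] = status

    def get_state(self, task_id):
        return self._state.get(task_id)

    def log(self, task_id, message):
        # Centralized log shipping would go here; we just collect.
        self._logs.append((task_id, message))


manager = ToyInMemoryStateManager()
manager.set_state("task-1", "running")
manager.log("task-1", "started ingestion")
manager.set_state("task-1", "completed")
print(manager.get_state("task-1"))  # completed
```

Swapping the backing store for DynamoDB, Postgres, or Redis changes where this state lives, which is what lets dashboards query it centrally.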
Data Managers are responsible for handling the input and output data for tasks. They implement various data-operation methods that tasks can leverage to ingest or save data during their runs. Data Managers can be categorized by their function and by the type of data they process (batch or streaming).
Data Managers manage data partitioning for both batch and streaming data. By adhering to common data patterns, they enable the system's components to operate independently, fostering the creation of intricate networks of tasks. This independence, while allowing for flexibility and scalability, ensures that cascading failures in one component don't necessarily compromise the entire system.
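To illustrate what partition semantics might look like for batch data (the exact layout Geniusrise uses is not shown here, so treat this scheme as an assumption), batch outputs are often laid out under time-based partition keys:

```python
from datetime import datetime, timezone

def partition_key(prefix: str, ts: datetime) -> str:
    # Hive-style, time-based partition path. The exact layout Geniusrise
    # uses may differ -- this only illustrates partition semantics.
    return f"{prefix}/year={ts.year}/month={ts.month:02d}/day={ts.day:02d}"

ts = datetime(2024, 3, 7, tzinfo=timezone.utc)
print(partition_key("spout-output", ts))  # spout-output/year=2024/month=03/day=07
```

Because every component writes and reads through the same partitioning convention, downstream tasks can locate their inputs without coordinating with upstream ones.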
At the heart of the Geniusrise framework are two primary component types: spouts and bolts.
Spouts: These are tasks responsible for ingesting data from various sources. Depending on the output type, spouts can either produce streaming output or batch output.
Batch: Runs periodically and produces data as batch output.
Stream: Runs continuously and produces data into a streaming output.
Bolts: Bolts are tasks that take in data, process it, and produce output. They can be categorized based on their input and output types:
Stream-Stream: Reads streaming data and produces streaming output.
Stream-Batch: Reads streaming data and produces batch output.
Batch-Stream: Reads batch data and produces streaming output.
Batch-Batch: Reads batch data and produces batch output.
Runners are the backbone of the Geniusrise framework, ensuring that tasks are executed seamlessly across various platforms. They encapsulate the environment and resources required for task execution, abstracting away the underlying complexities. Geniusrise offers the following runners:
Local Runner: Executes tasks directly on a local machine, ideal for development and testing.
Docker Runner: Runs tasks within Docker containers, ensuring a consistent and isolated environment.
Kubernetes Runner: Deploys tasks on Kubernetes clusters, leveraging its scalability and orchestration capabilities.
Airflow Runner: Integrates with Apache Airflow, allowing for complex workflow orchestration and scheduling.
ECS Runner: Executes tasks on AWS ECS, providing a managed container service.
Batch Runner: Optimized for batch computing workloads on platforms like AWS Batch.
Geniusrise is composed of the core framework and various plugins that implement specific tasks.
The core has to be installed first; selected plugins can then be installed as and when required.
For development, you may want to install from the repo:
```bash
git clone git@github.com:geniusrise/geniusrise.git
cd geniusrise
virtualenv venv -p `which python3.10`
source venv/bin/activate
pip install -r ./requirements.txt
make install # installs in your local venv directory
```
That's it! You've successfully installed Geniusrise and its plugins. 🎉
This post will guide you through creating inference APIs for different text classification tasks using Geniusrise, explaining the genius.yml configuration and providing examples of how to interact with your API using curl and python-requests.
```bash
curl -X POST http://localhost:3000/api/v1/classify \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{"text": "Your text here."}'
```
Python-Requests:
```python
import requests

response = requests.post(
    "http://localhost:3000/api/v1/classify",
    json={"text": "Your text here."},
    auth=("user", "password"),
)
print(response.json())
```
```bash
curl -X POST http://localhost:3000/api/v1/classification_pipeline \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{"text": "Your text here."}'
```
Python-Requests:
```python
import requests

response = requests.post(
    "http://localhost:3000/api/v1/classification_pipeline",
    json={"text": "Your text here."},
    auth=("user", "password"),
)
print(response.json())
```
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/classify \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{ "text": "i think i agree with bjp that hindus need to be respected" }' | jq

# {
#   "input": "i think i agree with bjp that hindus need to be respected",
#   "label_scores": {
#     "LEFT": 0.28080788254737854,
#     "CENTER": 0.18140915036201477,
#     "RIGHT": 0.5377829670906067    # <--
#   }
# }
```
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/classify \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{ "text": "these ghettos are sprawling these days and the people who live there stink" }' | jq

# {
#   "input": "these ghettos are sprawling these days and the people who live there stink",
#   "label_scores": {
#     "LEFT": 0.38681042194366455,    # <-- NIMBY?
#     "CENTER": 0.20437702536582947,
#     "RIGHT": 0.408812552690506      # <--
#   }
# }
```
Empirically, this works fairly well for medium-sized sentences in an American context.
Text classification can be used to figure out the intent of a user in a chat conversation, for example to determine whether the user intends to explore or to buy.
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/classify \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{ "text": "A man walks into a bar and buys a drink [SEP] A bloke swigs alcohol at a pub" }' | jq

# {
#   "input": "A man walks into a bar and buys a drink [SEP] A bloke swigs alcohol at a pub",
#   "label_scores": [
#     0.6105160713195801
#   ]
# }
```
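Note that the request above packs two sentences into a single string joined by the model's separator token. A tiny helper makes that explicit (the default separator is carried over from the example and is model-specific, so treat it as an assumption):

```python
def pair_input(sentence_a: str, sentence_b: str, sep: str = "[SEP]") -> str:
    # Join a sentence pair the way the paraphrase-detection request above
    # does; the separator token depends on the model's tokenizer.
    return f"{sentence_a} {sep} {sentence_b}"

print(pair_input("A man walks into a bar and buys a drink",
                 "A bloke swigs alcohol at a pub"))
```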
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/classify \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{ "text": "What a wonderful day to have a flat tire!" }' | jq

# {
#   "input": "What a wonderful day to have a flat tire!",
#   "label_scores": {
#     "non_irony": 0.023495545610785484,
#     "irony": 0.9765045046806335    <---
#   }
# }
```
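Responses like the one above return a score per label, so a small client-side helper can pick the winning label:

```python
def top_label(label_scores: dict) -> str:
    # Return the label with the highest score from an API response.
    return max(label_scores, key=label_scores.get)

scores = {"non_irony": 0.023495545610785484, "irony": 0.9765045046806335}
print(top_label(scores))  # irony
```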
There are 49,863 text classification models on Hugging Face as of this writing. Play around with them, tweak various parameters, and learn about the many use cases and cool shit that can be built with "mere" text classification!
Deploying question answering (QA) models can significantly enhance the capabilities of applications, providing users with specific, concise answers to their queries. Geniusrise simplifies this process, enabling developers to rapidly set up and deploy QA APIs. This guide will walk you through the steps to create inference APIs for different QA tasks using Geniusrise, focusing on configuring the genius.yml file and providing interaction examples via curl and python-requests.
Before diving into the setup and deployment of question answering (QA) models using Geniusrise, it's essential to understand the two main types of QA tasks: generative and extractive. This distinction is crucial for selecting the right model for your application and configuring your genius.yml file accordingly.
Generative QA models are designed to produce answers by generating text based on the context and the question asked. These models do not restrict their responses to the text's snippets but rather "generate" a new text passage that answers the question. Generative models are powerful for open-ended questions where the answer may not be directly present in the context or requires synthesis of information from multiple parts of the context.
Extractive QA models, on the other hand, identify and extract a specific snippet from the provided text that answers the question. This approach is particularly effective for factual questions where the answer is explicitly stated in the text. Extractive QA is advantageous because it limits the model's responses to the actual content of the input text, reducing the chances of hallucination (producing incorrect or unfounded information) that can occur with generative models.
Accuracy: Extractive QA models provide answers directly sourced from the input text, ensuring that the information is accurate and grounded in the provided context.
Reliability: By constraining the answers to the text snippets, extractive QA minimizes the risk of hallucinations, making it a reliable choice for applications where factual correctness is paramount.
Efficiency for RAG: Extractive QA tasks can be particularly efficient for Retrieval-Augmented Generation (RAG) because they allow for precise information retrieval without the need for generating new text, which can be computationally more demanding.
The models discussed in this guide focus on extractive QA tasks, which are particularly well-suited for direct, fact-based question answering from provided texts.
Extractive QA models are ideal for applications requiring high precision and direct answers from given texts.
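A practical corollary: because extractive answers are literal spans of the input, a client can cheaply sanity-check that a returned answer is grounded in the context. A naive sketch:

```python
def is_grounded(answer: str, context: str) -> bool:
    # Extractive QA answers should be literal spans of the context;
    # this naive substring check flags anything that isn't.
    return answer.lower() in context.lower()

context = "Geniusrise was created by a team of dedicated developers."
print(is_grounded("a team of dedicated developers", context))  # True
print(is_grounded("a single developer", context))  # False
```

This kind of check is one reason extractive QA pairs well with RAG: the retrieval layer supplies the context, and groundedness can be verified mechanically.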
To adapt the API for various QA tasks, simply change the model_name in your genius.yml. For example, to switch to a model specializing in medical QA, you might use bert-large-uncased-whole-word-masking-finetuned-squad for broader coverage of medical inquiries.
Geniusrise enables two primary ways to interact with your Question Answering API: through direct question-answering and utilizing the Hugging Face pipeline. Below, we provide examples on how to use both endpoints using curl and python-requests.
This API endpoint directly answers questions based on the provided context.
Using curl:
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/answer \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{ "data": "Theres something magical about Recurrent Neural Networks (RNNs). I still remember when I trained my first recurrent network for Image Captioning. Within a few dozen minutes of training my first baby model (with rather arbitrarily-chosen hyperparameters) started to generate very nice looking descriptions of images that were on the edge of making sense. Sometimes the ratio of how simple your model is to the quality of the results you get out of it blows past your expectations, and this was one of those times. What made this result so shocking at the time was that the common wisdom was that RNNs were supposed to be difficult to train (with more experience Ive in fact reached the opposite conclusion). Fast forward about a year: Im training RNNs all the time and Ive witnessed their power and robustness many times, and yet their magical outputs still find ways of amusing me.", "question": "What is the common wisdom about RNNs?" }' | jq
```
Using python-requests:
```python
import requests

data = {
    "data": "Theres something magical about Recurrent Neural Networks (RNNs). I still remember when I trained my first recurrent network for Image Captioning. Within a few dozen minutes of training my first baby model (with rather arbitrarily-chosen hyperparameters) started to generate very nice looking descriptions of images that were on the edge of making sense. Sometimes the ratio of how simple your model is to the quality of the results you get out of it blows past your expectations, and this was one of those times. What made this result so shocking at the time was that the common wisdom was that RNNs were supposed to be difficult to train (with more experience Ive in fact reached the opposite conclusion). Fast forward about a year: Im training RNNs all the time and Ive witnessed their power and robustness many times, and yet their magical outputs still find ways of amusing me.",
    "question": "What is the common wisdom about RNNs?",
}
response = requests.post(
    "http://localhost:3000/api/v1/answer",
    json=data,
    auth=("user", "password"),
)
print(response.json())
```
This API endpoint leverages the Hugging Face pipeline for answering questions, offering a streamlined way to use pre-trained models for question answering.
Using curl:
```bash
curl -X POST http://localhost:3000/api/v1/answer_pipeline \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{"question": "Who created Geniusrise?", "data": "Geniusrise was created by a team of dedicated developers."}'
```
Using python-requests:
```python
import requests

data = {
    "question": "Who created Geniusrise?",
    "data": "Geniusrise was created by a team of dedicated developers.",
}
response = requests.post(
    "http://localhost:3000/api/v1/answer_pipeline",
    json=data,
    auth=("user", "password"),
)
print(response.json())
```
A common problem with QA models is small context size, which limits their ability to process large documents or large amounts of input text. While language models keep getting bigger context windows, QA models tend to be much smaller and support shorter contexts.
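A common client-side workaround is sliding-window chunking: split the document into overlapping windows that fit the model's context, query each window, and keep the best-scoring answer. A sketch of the chunking step (window and overlap sizes are illustrative assumptions):

```python
def chunk_text(text: str, window: int = 100, overlap: int = 20):
    # Split text into overlapping word windows so each chunk fits a small
    # QA context; overlap reduces the chance of cutting an answer in half.
    words = text.split()
    step = window - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + window]))
        if start + window >= len(words):
            break
    return chunks

chunks = chunk_text("word " * 250, window=100, overlap=20)
print(len(chunks))  # 3
```

Each chunk would then be sent to the /api/v1/answer endpoint with the same question, and the answer with the highest model score kept.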
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/answer \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{ "data": "Theres something magical about Recurrent Neural Networks (RNNs). I still remember when I trained my first recurrent network for Image Captioning. Within a few dozen minutes of training my first baby model (with rather arbitrarily-chosen hyperparameters) started to generate very nice looking descriptions of images that were on the edge of making sense. Sometimes the ratio of how simple your model is to the quality of the results you get out of it blows past your expectations, and this was one of those times. What made this result so shocking at the time was that the common wisdom was that RNNs were supposed to be difficult to train (with more experience Ive in fact reached the opposite conclusion). Fast forward about a year: Im training RNNs all the time and Ive witnessed their power and robustness many times, and yet their magical outputs still find ways of amusing me. This post is about sharing some of that magic with you. By the way, together with this post I am also releasing code on Github that allows you to train character-level language models based on multi-layer LSTMs. You give it a large chunk of text and it will learn to generate text like it one character at a time. You can also use it to reproduce my experiments below. But we’re getting ahead of ourselves; What are RNNs anyway? Recurrent Neural Networks Sequences. Depending on your background you might be wondering: What makes Recurrent Networks so special? A glaring limitation of Vanilla Neural Networks (and also Convolutional Networks) is that their API is too constrained: they accept a fixed-sized vector as input (e.g. an image) and produce a fixed-sized vector as output (e.g. probabilities of different classes). Not only that: These models perform this mapping using a fixed amount of computational steps (e.g. the number of layers in the model). The core reason that recurrent nets are more exciting is that they allow us to operate over sequences of vectors: Sequences in the input, the output, or in the most general case both. A few examples may make this more concrete: Each rectangle is a vector and arrows represent functions (e.g. matrix multiply). Input vectors are in red, output vectors are in blue and green vectors hold the RNNs state (more on this soon). From left to right: (1) Vanilla mode of processing without RNN, from fixed-sized input to fixed-sized output (e.g. image classification). (2) Sequence output (e.g. image captioning takes an image and outputs a sentence of words). (3) Sequence input (e.g. sentiment analysis where a given sentence is classified as expressing positive or negative sentiment). (4) Sequence input and sequence output (e.g. Machine Translation: an RNN reads a sentence in English and then outputs a sentence in French). (5) Synced sequence input and output (e.g. video classification where we wish to label each frame of the video). Notice that in every case are no pre-specified constraints on the lengths sequences because the recurrent transformation (green) is fixed and can be applied as many times as we like. As you might expect, the sequence regime of operation is much more powerful compared to fixed networks that are doomed from the get-go by a fixed number of computational steps, and hence also much more appealing for those of us who aspire to build more intelligent systems. Moreover, as we’ll see in a bit, RNNs combine the input vector with their state vector with a fixed (but learned) function to produce a new state vector. This can in programming terms be interpreted as running a fixed program with certain inputs and some internal variables. Viewed this way, RNNs essentially describe programs. In fact, it is known that RNNs are Turing-Complete in the sense that they can to simulate arbitrary programs (with proper weights). But similar to universal approximation theorems for neural nets you shouldn’t read too much into this. In fact, forget I said anything.", "question": "What do the models essentially do?" }' | jq
```
```bash
# {
#   "data": "Theres something magical about Recurrent Neural Networks (RNNs). ...",
#   "question": "What do the models essentially do?",
#   "answer": {
#     "answers": [
#       "they allow us to operate over sequences of vectors"    <---
#     ],
#     "aggregation": "NONE"
#   }
# }
```
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/answer \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{ "data": "The choice of medication or combination of medications depends on various factors, including your personal risk factors, your age, your health and possible drug side effects. Common choices include: Statins. Statins block a substance your liver needs to make cholesterol. This causes your liver to remove cholesterol from your blood. Choices include atorvastatin, fluvastatin, lovastatin, pitavastatin, rosuvastatin and simvastatin. Cholesterol absorption inhibitors. The drug ezetimibe helps reduce blood cholesterol by limiting the absorption of dietary cholesterol. Ezetimibe can be used with a statin drug. Bempedoic acid. This newer drug works in much the same way as statins but is less likely to cause muscle pain. Adding bempedoic acid to a maximum statin dosage can help lower LDL significantly. A combination pill containing both bempedoic acid and ezetimibe also is available. Bile-acid-binding resins. Your liver uses cholesterol to make bile acids, a substance needed for digestion. The medications cholestyramine, colesevelam and colestipol lower cholesterol indirectly by binding to bile acids. This prompts your liver to use excess cholesterol to make more bile acids, which reduces the level of cholesterol in your blood. PCSK9 inhibitors. These drugs can help the liver absorb more LDL cholesterol, which lowers the amount of cholesterol circulating in your blood. Alirocumab and evolocumab might be used for people who have a genetic condition that causes very high levels of LDL or in people with a history of coronary disease who have intolerance to statins or other cholesterol medications. They are injected under the skin every few weeks and are expensive. Medications for high triglycerides If you also have high triglycerides, your doctor might prescribe: Fibrates. The medications fenofibrate and gemfibrozil reduce your liver s production of very-low-density lipoprotein cholesterol and speed the removal of triglycerides from your blood. VLDL cholesterol contains mostly triglycerides. Using fibrates with a statin can increase the risk of statin side effects. Omega-3 fatty acid supplements. Omega-3 fatty acid supplements can help lower your triglycerides. They are available by prescription or over-the-counter.", "question": "What do i take if i have high VLDL?" }' | jq
```
```bash
# {
#   "data": "The choice of medication or combination of medications depends on various factors, ...",
#   "question": "What do i take if i have high VLDL?",
#   "answer": {
#     "answers": [
#       "fibrates"    <-------
#     ],
#     "aggregation": "NONE"
#   }
# }
```
Now there are also models like the sloshed lawyer, but they are not recommended in production 😆
Deploying table question answering (QA) models is a sophisticated task that Geniusrise simplifies for developers. This guide aims to demonstrate how you can use Geniusrise to set up and run APIs for table QA, a crucial functionality for extracting structured information from tabular data. We'll cover the setup process, explain the parameters in the genius.yml file with examples, and provide code snippets for interacting with your API using curl and python-requests.
To tailor your API for different table QA tasks, such as financial data analysis or sports statistics, you can modify the model_name in your genius.yml. For example, to switch to a model optimized for financial tables, you might use google/tapas-large-finetuned-finance.
```bash
curl -X POST http://localhost:3000/api/v1/answer \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{"question": "Who had the highest batting average?", "data": [{"player": "John Doe", "average": ".312"}, {"player": "Jane Roe", "average": ".328"}]}'
```
Using python-requests:
```python
import requests

data = {
    "question": "Who had the highest batting average?",
    "data": [
        {"player": "John Doe", "average": ".312"},
        {"player": "Jane Roe", "average": ".328"},
    ],
}
response = requests.post(
    "http://localhost:3000/api/v1/answer",
    json=data,
    auth=("user", "password"),
)
print(response.json())
```
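Table QA models such as TAPAS typically consume tables in a column-oriented form with every cell as a string; whether a given API wants rows or columns is model-specific, so treat this transform as an illustrative assumption:

```python
def rows_to_columns(rows: list) -> dict:
    # Convert a list of row dicts (as sent in the request above) into a
    # column-oriented dict of string lists, the shape TAPAS-style table
    # QA models commonly expect.
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(str(value))
    return columns

rows = [{"player": "John Doe", "average": ".312"},
        {"player": "Jane Roe", "average": ".328"}]
print(rows_to_columns(rows))
```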
Although the pipeline endpoint is primarily for text-based QA, you might experiment with it for preprocessing, or for extracting text from tables before querying.
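For instance, a small helper (entirely ours, not part of Geniusrise) could flatten table rows into prose before sending them to the text-based pipeline endpoint:

```python
# Hypothetical pre-processing helper: flatten table rows into plain text so a
# text-based QA endpoint can be asked about tabular data.
def table_to_text(rows):
    return " ".join(
        "; ".join(f"{key}: {value}" for key, value in row.items()) + "."
        for row in rows
    )


rows = [
    {"player": "John Doe", "average": ".312"},
    {"player": "Jane Roe", "average": ".328"},
]
print(table_to_text(rows))
# player: John Doe; average: .312. player: Jane Roe; average: .328.
```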
Using curl:
```bash
curl -X POST http://localhost:3000/api/v1/answer_pipeline \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{"question": "What is the total revenue?", "data": "The total revenue in Q1 was $10M, and in Q2 was $15M."}'
```
Using python-requests:
importrequestsdata={"question":"What is the total revenue?","data":"ThetotalrevenueinQ1was$10M,andinQ2was$15M."}response=requests.post("http://localhost:3000/api/v1/answer_pipeline",json=data,auth=('user','password'))print(response.json())
Given some data and a natural language query, these models generate a query that can be used to compute the result. These models are what power spreadsheet automations.
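Concretely, a table-QA response of this kind pairs answer cells with an aggregation operator (as in the dataset sample shown earlier), and the caller computes the final value. A minimal, hypothetical post-processing sketch:

```python
# Hypothetical post-processing for a table-QA response: the model selects answer
# cells plus an aggregation operator, and the caller computes the final result.
def compute_result(cells, aggregation):
    if aggregation == "COUNT":
        return len(cells)
    if aggregation in ("SUM", "AVERAGE"):
        values = [float(c) for c in cells]
        return sum(values) if aggregation == "SUM" else sum(values) / len(values)
    return cells  # "NONE": the selected cells are the answer themselves


print(compute_result(["10", "15"], "SUM"))   # 25.0
print(compute_result(["fibrates"], "NONE"))  # ['fibrates']
```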
Natural Language Inference (NLI) is like a game where you have to figure out if one sentence can logically follow from another or not. Imagine you hear someone say, "The dog is sleeping in the sun." Then, someone asks if it's true that "The dog is outside." In this game, you'd say "yes" because if the dog is sleeping in the sun, it must be outside. Sometimes, the sentences don't match up, like if someone asks if the dog is swimming. You'd say "no" because sleeping in the sun doesn't mean swimming. And sometimes, you can't tell, like if someone asks if the dog is dreaming. Since you don't know, you'd say "maybe." NLI is all about playing this matching game with sentences to help computers understand and use language like we do.
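The dog example above, written as (premise, hypothesis, label) triples, covers the three possible NLI verdicts:

```python
# The three NLI labels from the "matching game" above, as premise/hypothesis pairs.
nli_examples = [
    ("The dog is sleeping in the sun.", "The dog is outside.", "entailment"),
    ("The dog is sleeping in the sun.", "The dog is swimming.", "contradiction"),
    ("The dog is sleeping in the sun.", "The dog is dreaming.", "neutral"),
]
for premise, hypothesis, label in nli_examples:
    print(f"{hypothesis} -> {label}")
```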
This post will explore setting up APIs for various NLI tasks using Geniusrise, including entailment, classification, textual similarity, and fact-checking. We’ll dive into the configuration details, provide interaction examples, and discuss how to tailor the setup for specific use cases.
Objective: Assess whether a hypothesis is supported (entailment), contradicted (contradiction), or neither (neutral) by a premise.
Using curl:
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/entailment \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{
    "premise": "This a very good entry level smartphone, battery last 2-3 days after fully charged when connected to the internet. No memory lag issue when playing simple hidden object games. Performance is beyond my expectation, i bought it with a good bargain, couldnt ask for more!",
    "hypothesis": "the phone has an awesome battery life"
  }' | jq
```
Using python-requests:
importrequestsdata={"premise":"This a very good entry level smartphone, battery last 2-3 days after fully charged when connected to the internet. No memory lag issue when playing simple hidden object games. Performance is beyond my expectation, i bought it with a good bargain, couldnt ask for more!","hypothesis":"the phone has an awesome battery life"}response=requests.post("http://localhost:3000/api/v1/entailment",json=data,auth=('user','password'))print(response.json())
Objective: Classify a piece of text into predefined categories.
Using curl:
```bash
curl -X POST http://localhost:3000/api/v1/classify \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{"text": "I love playing soccer.", "candidate_labels": ["sport", "cooking", "travel"]}'
```
Using python-requests:
importrequestsdata={"text":"I love playing soccer.","candidate_labels":["sport","cooking","travel"]}response=requests.post("http://localhost:3000/api/v1/classify",json=data,auth=('user','password'))print(response.json())
Objective: Determine the similarity score between two texts.
Using curl:
```bash
curl -X POST http://localhost:3000/api/v1/textual_similarity \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{"text1": "I enjoy swimming.", "text2": "Swimming is my hobby."}'
```
Using python-requests:
importrequestsdata={"text1":"I enjoy swimming.","text2":"Swimming is my hobby."}response=requests.post("http://localhost:3000/api/v1/textual_similarity",json=data,auth=('user','password'))print(response.json())
Objective: Verify the accuracy of a statement based on provided context or reference material.
Using curl:
```bash
curl -X POST http://localhost:3000/api/v1/fact_checking \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{"context": "The Eiffel Tower is located in Paris.", "statement": "The Eiffel Tower is in France."}'
```
Using python-requests:
importrequestsdata={"context":"The Eiffel Tower is located in Paris.","statement":"The Eiffel Tower is in France."}response=requests.post("http://localhost:3000/api/v1/fact_checking",json=data,auth=('user','password'))print(response.json())
Each of these endpoints serves a specific NLI-related purpose, from evaluating logical relationships between texts to classifying and checking facts. By leveraging these APIs, developers can enhance their applications with deep, contextual understanding of natural language.
To deploy APIs for various NLI tasks, simply adjust the model_name in your genius.yml. For instance, to switch to a model optimized for textual similarity or fact-checking, replace microsoft/deberta-v2-xlarge-mnli with the appropriate model identifier.
NLI, when used for zero-shot classification, applies in a large number of contexts. Consider a chat use case where there is an entire tree of possible scenarios, and you want to identify which node in the tree you are in so you can feed that particular prompt to another chat model.
Let's consider a 2-level tree such as this for an internal helpdesk:
intents={"IT Support":["Computer or hardware issues","Software installation and updates","Network connectivity problems","Access to digital tools and resources",],"HR Inquiries":["Leave policy and requests","Benefits and compensation queries","Employee wellness programs","Performance review process",],"Facilities Management":["Workspace maintenance requests","Meeting room bookings","Parking and transportation services","Health and safety concerns",],"Finance and Expense":["Expense report submission","Payroll inquiries","Budget allocation questions","Procurement process",],"Training and Development":["Professional development opportunities","Training program schedules","Certification and learning resources","Mentorship and coaching programs",],"Project Management":["Project collaboration tools","Deadline extensions and modifications","Resource allocation","Project status updates",],"Travel and Accommodation":["Business travel arrangements","Travel policy and reimbursements","Accommodation bookings","Visa and travel documentation",],"Legal and Compliance":["Contract review requests","Data privacy and security policies","Compliance training and certifications","Legal consultation and support",],"Communications and Collaboration":["Internal communication platforms","Collaboration tools and access","Team meeting coordination","Cross-departmental initiatives",],"Employee Feedback and Suggestions":["Employee satisfaction surveys","Feedback submission channels","Suggestion box for improvements","Employee engagement activities",],"Onboarding and Offboarding":["New employee onboarding process","Offboarding procedures","Orientation schedules","Transition support",],"Administrative Assistance":["Document and record-keeping","Scheduling and calendar management","Courier and mailing services","Administrative support requests",],}
Let's deploy a large model so it's more intelligent:
We can then walk this tree to zero in on the user's micro-intent and retrieve the prompt to feed into the model:
```python
import requests

prompt = "I need to travel to singapore next week 😃."


def find_most_probable_class(prompt, intents):
    response = requests.post(
        "http://localhost:3000/api/v1/classify",
        json={"text": prompt, "candidate_labels": intents},
        auth=("user", "password"),
    )
    label_scores = response.json()["label_scores"]
    max_score = max(label_scores.values())
    chosen_label = [k for k, v in label_scores.items() if v == max_score][0]
    return chosen_label


level1 = find_most_probable_class(prompt, list(intents.keys()))
level2 = find_most_probable_class(prompt, list(intents[level1]))

print(f"The request is for department: {level1} and specifically for {level2}")
# The request is for department: Travel and Accommodation and specifically for Visa and travel documentation
```
Imagine a scenario where an AI is used to judge a debate competition in real-time. Each participant's argument is evaluated for logical consistency, relevance, and how well it counters the opponent's previous points.
debate_points=[{"speaker":"Alice","statement":"Renewable energy can effectively replace fossil fuels."},{"speaker":"Bob","statement":"Renewable energy is not yet reliable enough to meet all our energy needs."},]foriinrange(1,len(debate_points)):premise=debate_points[i-1]["statement"]hypothesis=debate_points[i]["statement"]response=requests.post("http://localhost:3000/api/v1/entailment",json={"premise":premise,"hypothesis":hypothesis},auth=('user','password'))label_scores=response.json()["label_scores"]max_score=max(label_scores.values())chosen_label=[kfork,vinlabel_scores.items()ifv==max_score][0]print(f"Debate point by {debate_points[i]['speaker']}: {hypothesis}")print(f"Judgement: {chosen_label}")# Debate point by Bob: Renewable energy is not yet reliable enough to meet all our energy needs.# Judgement: neutral
A model can be used to analyze a story plot to determine if the events and characters' decisions are logically consistent and plausible within the story's universe.
```python
story_events = [
    "The hero discovers a secret door in their house leading to a magical world.",
    "Despite being in a magical world, the hero uses their smartphone to call for help.",
    "The hero defeats the villain using a magical sword found in the new world.",
]

for i in range(1, len(story_events)):
    premise = story_events[i - 1]
    hypothesis = story_events[i]

    response = requests.post(
        "http://localhost:3000/api/v1/entailment",
        json={"premise": premise, "hypothesis": hypothesis},
        auth=("user", "password"),
    )
    label_scores = response.json()["label_scores"]
    if "neutral" in label_scores:
        del label_scores["neutral"]
    max_score = max(label_scores.values())
    chosen_label = [k for k, v in label_scores.items() if v == max_score][0]

    print(f"Story event - {chosen_label}: {hypothesis}")

# Story event - contradiction: Despite being in a magical world, the hero uses their smartphone to call for help.
# Story event - contradiction: The hero defeats the villain using a magical sword found in the new world.
```
This application involves analyzing customer feedback to categorize it into compliments, complaints, or suggestions, providing valuable insights into customer satisfaction and areas for improvement.
```python
feedbacks = [
    "The new update makes the app much easier to use. Great job!",
    "I've been facing frequent crashes after the last update.",
    "It would be great if you could add a dark mode feature.",
    "Otherwise you leave me no choice but to slowly torture your soul.",
]
categories = ["compliment", "complaint", "suggestion", "murderous intent"]

for feedback in feedbacks:
    response = requests.post(
        "http://localhost:3000/api/v1/classify",
        json={"text": feedback, "candidate_labels": categories},
        auth=("user", "password"),
    )
    label_scores = response.json()["label_scores"]
    max_score = max(label_scores.values())
    chosen_label = [k for k, v in label_scores.items() if v == max_score][0]
    print(f"Feedback - {chosen_label}: {feedback}")

# Feedback - suggestion: The new update makes the app much easier to use. Great job!
# Feedback - complaint: I've been facing frequent crashes after the last update.
# Feedback - suggestion: It would be great if you could add a dark mode feature.
# Feedback - murderous intent: Otherwise you leave me no choice but to slowly torture your soul.
```
This is a game where players can simulate courtroom trials!
Players submit evidence and arguments, and the AI acts as the judge, determining the credibility and relevance of each submission to the case.
```python
courtroom_evidence = [
    {"evidence": "The defendant's fingerprints were found on the weapon."},
    {"evidence": "A witness reported seeing the defendant near the crime scene."},
]

for evidence in courtroom_evidence:
    submission = evidence["evidence"]
    response = requests.post(
        "http://localhost:3000/api/v1/classify",
        json={"text": submission, "candidate_labels": ["highly relevant", "relevant", "irrelevant"]},
        auth=("user", "password"),
    )
    label_scores = response.json()["label_scores"]
    max_score = max(label_scores.values())
    chosen_label = [k for k, v in label_scores.items() if v == max_score][0]
    print(f"Evidence submitted: {submission}")
    print(f"Judged as: {chosen_label}")

# Evidence submitted: The defendant's fingerprints were found on the weapon.
# Judged as: highly relevant
# Evidence submitted: A witness reported seeing the defendant near the crime scene.
# Judged as: highly relevant
```
There are 218 models tagged "zero-shot-classification" on the Hugging Face hub, but a simple search for "nli" turns up 822 models, so many models are not tagged properly. NLI is a core and very interesting NLP task, and a few good general-purpose models can be turned into a lot of fun!
This guide will walk you through deploying translation models using Geniusrise, covering the setup, configuration, and interaction with the translation API for various use cases.
Translate text from one language to another using a simple HTTP request.
Example using curl:
```bash
curl -X POST http://localhost:3000/api/v1/translate \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{
    "text": "संयुक्त राष्ट्र के प्रमुख का कहना है कि सीरिया में कोई सैन्य समाधान नहीं है",
    "source_lang": "hi_IN",
    "target_lang": "en_XX",
    "decoding_strategy": "generate",
    "decoder_start_token_id": 2,
    "early_stopping": true,
    "eos_token_id": 2,
    "forced_eos_token_id": 2,
    "max_length": 200,
    "num_beams": 5,
    "pad_token_id": 1
  }' | jq
```
Example using python-requests:
importrequestsdata={"text":"संयुक्त राष्ट्र के प्रमुख का कहना है कि सीरिया में कोई सैन्य समाधान नहीं है","source_lang":"hi_IN","target_lang":"en_XX","decoding_strategy":"generate","decoder_start_token_id":2,"early_stopping":true,"eos_token_id":2,"forced_eos_token_id":2,"max_length":200,"num_beams":5,"pad_token_id":1}response=requests.post("http://localhost:3000/api/v1/translate",json=data,auth=('user','password'))print(response.json())
For use cases requiring specific translation strategies or parameters (e.g., beam search, number of beams), you can pass additional parameters in your request to customize the translation process.
Adjust the source_lang and target_lang parameters to cater to various language pairs, enabling translation between numerous languages supported by the chosen model.
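As a sketch, you might fan the same sentence out to several target languages by varying `target_lang`, then POST each payload to the `/api/v1/translate` endpoint shown above. The language codes here are from the MBART-50 set; adjust them for whichever model you deploy:

```python
# Sketch: build one translation payload per target language. The decoding
# parameters mirror the earlier examples; the helper name is ours.
def build_payload(text, source_lang, target_lang):
    return {
        "text": text,
        "source_lang": source_lang,
        "target_lang": target_lang,
        "decoding_strategy": "generate",
        "max_length": 200,
        "num_beams": 5,
    }


targets = ["fr_XX", "de_DE", "ja_XX"]  # MBART-50 codes for French, German, Japanese
payloads = [
    build_payload("The UN chief says there is no military solution in Syria", "en_XX", t)
    for t in targets
]
# Each payload would then be POSTed to http://localhost:3000/api/v1/translate
```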
Both the MBART and the NLLB families have several members, with facebook/nllb-moe-54b, a 54-billion-parameter mixture-of-experts model, being the largest and most capable one.
See here for the language codes for the FLORES-200 dataset.
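For reference, FLORES-200 codes qualify each language with its writing script; a few examples (verify against the official list):

```python
# A few FLORES-200 language codes, as used by the NLLB models; each code pairs an
# ISO 639-3 language with its script.
flores_codes = {
    "English": "eng_Latn",
    "Hindi": "hin_Deva",
    "French": "fra_Latn",
    "Tatar (Cyrillic)": "tat_Cyrl",
}
print(flores_codes["Tatar (Cyrillic)"])  # tat_Cyrl
```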
```bash
curl -X POST http://localhost:3000/api/v1/translate \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{
    "text": "संयुक्त राष्ट्र के प्रमुख का कहना है कि सीरिया में कोई सैन्य समाधान नहीं है",
    "target_lang": "tat_Cyrl",
    "decoding_strategy": "generate",
    "bos_token_id": 0,
    "decoder_start_token_id": 2,
    "eos_token_id": 2,
    "max_length": 200,
    "pad_token_id": 1
  }'
```
Now, how do we even verify whether this is correct? Let's reverse-translate, then check sentence similarity with NLI. We need to launch two containers: one for translation and another for NLI:
```python
import requests

# First we translate this Hindi sentence to Tatar
data = {
    "text": "संयुक्त राष्ट्र के प्रमुख का कहना है कि सीरिया में कोई सैन्य समाधान नहीं है",
    "target_lang": "tat_Cyrl",
    "decoding_strategy": "generate",
    "bos_token_id": 0,
    "decoder_start_token_id": 2,
    "eos_token_id": 2,
    "max_length": 200,
    "pad_token_id": 1,
}
response = requests.post(
    "http://localhost:3000/api/v1/translate",
    json=data,
    auth=("user", "password"),
)
translated = response.json()["translated_text"]
# БМО башлыгы Сүриядә хәрби чаралар юк дип белдерә

# Then we translate the Tatar back to Hindi
rev = data.copy()
rev["text"] = translated
rev["target_lang"] = "hin_Deva"
response = requests.post(
    "http://localhost:3000/api/v1/translate",
    json=rev,
    auth=("user", "password"),
)
rev_translated = response.json()["translated_text"]

# Finally we look at the similarity of the source and reverse-translated Hindi sentences
data = {"text1": data["text"], "text2": rev_translated}
response = requests.post(
    "http://localhost:3001/api/v1/textual_similarity",
    json=data,
    auth=("user", "password"),
)
print(response.json())
# {
#     'text1': 'संयुक्त राष्ट्र के प्रमुख का कहना है कि सीरिया में कोई सैन्य समाधान नहीं है',
#     'text2': 'बीएमओ प्रमुख ने कहा कि सीरिया में कोई सैन्य उपाय नहीं हैं',
#     'similarity_score': 0.9829527983379287
# }
```
0.9829527983379287 looks like a great similarity score, so the translation really works! (or the mistakes are isomorphic) 🥳👍
There is not much else to do in translation except mess around with different languages 🤷‍♂️ There are not many models either; Facebook is the undisputed leader in translation models.
This guide will walk you through setting up, configuring, and interacting with a summarization API using Geniusrise, highlighting various use cases and how to adapt the configuration for different models.
You can summarize text by making HTTP requests to your API.
Example with curl:
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/summarize \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{
    "text": "Theres something magical about Recurrent Neural Networks (RNNs). I still remember when I trained my first recurrent network for Image Captioning. Within a few dozen minutes of training my first baby model (with rather arbitrarily-chosen hyperparameters) started to generate very nice looking descriptions of images that were on the edge of making sense. Sometimes the ratio of how simple your model is to the quality of the results you get out of it blows past your expectations, and this was one of those times. What made this result so shocking at the time was that the common wisdom was that RNNs were supposed to be difficult to train (with more experience Ive in fact reached the opposite conclusion). Fast forward about a year: Im training RNNs all the time and Ive witnessed their power and robustness many times, and yet their magical outputs still find ways of amusing me.",
    "decoding_strategy": "generate",
    "bos_token_id": 0,
    "decoder_start_token_id": 2,
    "early_stopping": true,
    "eos_token_id": 2,
    "forced_bos_token_id": 0,
    "forced_eos_token_id": 2,
    "length_penalty": 2.0,
    "max_length": 142,
    "min_length": 56,
    "no_repeat_ngram_size": 3,
    "num_beams": 4,
    "pad_token_id": 1,
    "do_sample": false
  }' | jq
```
Example with python-requests:
importrequestsdata={"text":"Theres something magical about Recurrent Neural Networks (RNNs). I still remember when I trained my first recurrent network for Image Captioning. Within a few dozen minutes of training my first baby model (with rather arbitrarily-chosen hyperparameters) started to generate very nice looking descriptions of images that were on the edge of making sense. Sometimes the ratio of how simple your model is to the quality of the results you get out of it blows past your expectations, and this was one of those times. What made this result so shocking at the time was that the common wisdom was that RNNs were supposed to be difficult to train (with more experience Ive in fact reached the opposite conclusion). Fast forward about a year: Im training RNNs all the time and Ive witnessed their power and robustness many times, and yet their magical outputs still find ways of amusing me.","decoding_strategy":"generate","bos_token_id":0,"decoder_start_token_id":2,"early_stopping":true,"eos_token_id":2,"forced_bos_token_id":0,"forced_eos_token_id":2,"length_penalty":2.0,"max_length":142,"min_length":56,"no_repeat_ngram_size":3,"num_beams":4,"pad_token_id":1,"do_sample":false}response=requests.post("http://localhost:3000/api/v1/summarize",json=data,auth=('user','password'))print(response.json())
For use cases requiring specific summarization strategies or adjustments (e.g., length penalty, no repeat ngram size), additional parameters can be included in your request to customize the summarization output.
To cater to various summarization needs, such as domain-specific texts or languages, simply adjust the model_name in your genius.yml. For example, for summarizing scientific papers, you might choose a model like allenai/longformer-base-4096.
Adjust summarization parameters such as max_length, min_length, and num_beams to fine-tune the output based on the specific requirements of your application.
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/summarize \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{
    "text": " the big variety of data coming from diverse sources is one of the key properties of the big data phenomenon. It is, therefore, beneficial to understand how data is generated in various environments and scenarios, before looking at what should be done with this data and how to design the best possible architecture to accomplish this The evolution of IT architectures, described in Chapter 2, means that the data is no longer processed by a few big monolith systems, but rather by a group of services In parallel to the processing layer, the underlying data storage has also changed and became more distributed This, in turn, required a significant paradigm shift as the traditional approach to transactions (ACID) could no longer be supported. On top of this, cloud computing is becoming a major approach with the benefits of reducing costs and providing on-demand scalability but at the same time introducing concerns about privacy, data ownership, etc In the meantime the Internet continues its exponential growth: Every day both structured and unstructured data is published and available for processing: To achieve competitive advantage companies have to relate their corporate resources to external services, e.g. financial markets, weather forecasts, social media, etc While several of the sites provide some sort of API to access the data in a more orderly fashion; countless sources require advanced web mining and Natural Language Processing (NLP) processing techniques: Advances in science push researchers to construct new instruments for observing the universe O conducting experiments to understand even better the laws of physics and other domains. Every year humans have at their disposal new telescopes, space probes, particle accelerators, etc These instruments generate huge streams of data, which need to be stored and analyzed. The constant drive for efficiency in the industry motivates the introduction of new automation techniques and process optimization: This could not be done without analyzing the precise data that describe these processes. As more and more human tasks are automated, machines provide rich data sets, which can be analyzed in real-time to drive efficiency to new levels. Finally, it is now evident that the growth of the Internet of Things is becoming a major source of data. More and more of the devices are equipped with significant computational power and can generate a continuous data stream from their sensors. In the subsequent sections of this chapter, we will look at the domains described above to see what they generate in terms of data sets. We will compare the volumes but will also look at what is characteristic and important from their respective points of view. 3.1 The Internet is undoubtedly the largest database ever created by humans. While several well described; cleaned, and structured data sets have been made available through this medium, most of the resources are of an ambiguous, unstructured, incomplete or even erroneous nature. Still, several examples in the areas such as opinion mining, social media analysis, e-governance, etc, clearly show the potential lying in these resources. Those who can successfully mine and interpret the Internet data can gain unique insight and competitive advantage in their business An important area of data analytics on the edge of corporate IT and the Internet is Web Analytics.",
    "decoding_strategy": "generate",
    "bos_token_id": 0,
    "decoder_start_token_id": 2,
    "early_stopping": true,
    "eos_token_id": 2,
    "forced_bos_token_id": 0,
    "forced_eos_token_id": 2,
    "length_penalty": 2.0,
    "max_length": 142,
    "min_length": 56,
    "no_repeat_ngram_size": 3,
    "num_beams": 4,
    "pad_token_id": 1,
    "do_sample": false
  }' | jq
```
Summarization is a text-to-text task and can be used to transform the input text into another form. In this case, the model transforms Python code into plain-English explanations:
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/summarize \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{
    "text": " def create_parser(self, parser):\n """\n Create and return the command-line parser for managing spouts and bolts.\n """\n # fmt: off\n subparsers = parser.add_subparsers(dest="deploy")\n up_parser = subparsers.add_parser("up", help="Deploy according to the genius.yml file.", formatter_class=RichHelpFormatter)\n up_parser.add_argument("--spout", type=str, help="Name of the specific spout to run.")\n up_parser.add_argument("--bolt", type=str, help="Name of the specific bolt to run.")\n up_parser.add_argument("--file", default="genius.yml", type=str, help="Path of the genius.yml file, default to .")\n\n parser.add_argument("--spout", type=str, help="Name of the specific spout to run.")\n parser.add_argument("--bolt", type=str, help="Name of the specific bolt to run.")\n parser.add_argument("--file", default="genius.yml", type=str, help="Path of the genius.yml file, default to .")\n # fmt: on\n\n return parser",
    "decoding_strategy": "generate",
    "bos_token_id": 0,
    "decoder_start_token_id": 2,
    "early_stopping": true,
    "eos_token_id": 2,
    "forced_bos_token_id": 0,
    "forced_eos_token_id": 2,
    "length_penalty": 2.0,
    "max_length": 142,
    "min_length": 56,
    "no_repeat_ngram_size": 3,
    "num_beams": 4,
    "pad_token_id": 1,
    "do_sample": false
  }' | jq
```
Integrating chat models into applications can dramatically enhance user interaction, making it more engaging and intuitive. Geniusrise offers a simple and flexible way to deploy state-of-the-art chat models as APIs. This guide explores how to set up these APIs for various use cases.
curl-XPOST"http://localhost:3000/api/v1/chat_llama_cpp"\-H"Content-Type: application/json"\-u"user:password"\-d'{ "messages": [ {"role": "user", "content": "What is the capital of France?"}, {"role": "system", "content": "The capital of France is"} ], "temperature": 0.2, "top_p": 0.95, "top_k": 40, "max_tokens": 50 }'|jq
Let's deploy Hugging Face's chat-ui and connect it to vLLM APIs to interface with a 4-bit quantized (AWQ) Mistral model. This can run on my laptop with an RTX 4060 (8 GB VRAM).
Cool, let's write a small Gradio script to create a chat interface.
Install gradio:
```bash
pip install gradio
```
Create a chat.py file:
```python
# Import necessary libraries for handling network requests
import gradio as gr
import requests
from typing import List, Dict


def send_request_to_api(messages: List[Dict[str, str]]) -> str:
    """
    This function sends a POST request to a specified API endpoint with a payload
    containing a list of messages.

    :param messages: A list of messages to be sent. Each message is a dictionary
        containing a content key with its value.
    :return: The content of the last message received from the API.
    """
    # Specify the API endpoint URL
    url = "http://localhost:3000/api/v1/chat_llama_cpp"
    # Define headers for the request
    headers = {"Content-Type": "application/json"}
    # Authenticate the request
    auth = ("user", "password")
    # Prepare the payload data
    data = {
        "messages": messages,
        "temperature": 0.2,
        "top_p": 0.95,
        "top_k": 40,
        "max_tokens": 2048,
    }

    # Send the POST request and get the response
    response = requests.post(url, auth=auth, headers=headers, json=data)
    # Parse the response data
    response_data = response.json()

    if response.status_code == 200:
        # Get the content of the last message from the response data
        last_message = response_data["choices"][0]["message"]["content"]
        return last_message
    else:
        # Raise an exception in case of an error
        raise Exception("nooooooooooooooooooo!!")


def predict(message: str, history: List[List[str]]) -> str:
    """
    This function converts chat history into the expected format and adds the
    latest user message. Then it sends the data to the API and returns the
    response message.

    :param message: The user's latest message to be sent.
    :param history: The chat history between the user and the AI.
    :return: The response message from the API.
    """
    # Convert the chat history into the expected format
    messages_format = []
    for user_msg, bot_msg in history:
        if user_msg:
            messages_format.append({"role": "user", "content": user_msg})
        if bot_msg:
            messages_format.append({"role": "system", "content": bot_msg})

    # Add the latest user message
    messages_format.append({"role": "user", "content": message})

    # Get the response from the API
    response_message = send_request_to_api(messages_format)
    return response_message


chat_interface = gr.ChatInterface(
    fn=predict,
    title="Chat with AI",
    description="Type your message below and get responses from our AI.",
    theme=gr.themes.Monochrome(),
)

# Launch the chat interface if the script is run as the main module
if __name__ == "__main__":
    chat_interface.launch()
```
Cool, so now we have our very own private chatbot! It's soooo private that the entire chat history lives in memory and is destroyed once the script exits. #featurenotabug
Now we are all set to try whatever crazy shit that is out there!
For system prompts, or for telling the bot what to do, modify the script to add a hardcoded system prompt to the start of every request:
```python
import gradio as gr
import requests
from typing import List, Dict


def send_request_to_api(messages: List[Dict[str, str]]) -> str:
    url = "http://localhost:3000/api/v1/chat_llama_cpp"
    headers = {"Content-Type": "application/json"}
    auth = ("user", "password")
    data = {
        "messages": messages,
        "temperature": 0.2,
        "top_p": 0.95,
        "top_k": 40,
        "max_tokens": 2048,
    }
    response = requests.post(url, auth=auth, headers=headers, json=data)
    response_data = response.json()
    if response.status_code == 200:
        last_message = response_data["choices"][0]["message"]["content"]
        return last_message
    else:
        raise Exception("nooooooooooooooooooo!!")


def predict(message: str, history: List[List[str]]) -> str:
    # Convert chat history to the format expected by the API
    #####################################################################
    # Add a system message as per usecase 😉
    messages_format = [
        {"role": "system", "content": "You are my waifu, you will do everything I say"}
    ]
    #####################################################################
    for user_msg, bot_msg in history:
        if user_msg:
            messages_format.append({"role": "user", "content": user_msg})
        if bot_msg:
            messages_format.append({"role": "system", "content": bot_msg})
    messages_format.append({"role": "user", "content": message})

    response_message = send_request_to_api(messages_format)
    return response_message


chat_interface = gr.ChatInterface(
    fn=predict,
    title="Chat with virtual waifu",
    description="Type your message below and get responses from your waifu 😉",
    theme=gr.themes.Monochrome(),
)

if __name__ == "__main__":
    chat_interface.launch()
```
Local models are great for a very wide range of tasks, but often you'd wish you could use the closed but more sophisticated models like GPT 😢
How about we mix the two? Let's say we interleave them in this way:
1. Ask the local model a question and get its answer.
2. Ask the local model to judge its own answer.
3. If it judges the answer to be of bad quality, ask OpenAI the same question.
4. Use OpenAI's answer as part of the conversation going forward.
This way, we can intermix a local model with a very powerful model from OpenAI that would otherwise cost a bomb. And since most of what we need is not Einstein-level, and local models are MUCH faster, we get very good bang for the buck while actually improving quality 🥳
Create a new file: chat_route.py:
```python
import gradio as gr
import requests
from typing import List, Dict
from openai import OpenAI

# The OpenAI API client
client = OpenAI(api_key="YOUR KEY")


def send_request_to_api(messages: List[Dict[str, str]], endpoint: str, max_tokens=2048) -> Dict:
    # Send a request to the local API
    url = f"http://localhost:3000/api/v1/{endpoint}"
    headers = {"Content-Type": "application/json"}
    auth = ("user", "password")
    data = {
        "messages": messages,
        "temperature": 0.2,
        "top_p": 0.95,
        "top_k": 40,
        "max_tokens": max_tokens,
    }
    response = requests.post(url, auth=auth, headers=headers, json=data)
    if response.status_code == 200:
        return response.json()
    else:
        raise Exception("Error communicating with the local API.")


def query_openai_api(prompt: str) -> str:
    # Query the OpenAI API; gpt-4-turbo-preview is a chat model, so we use the
    # chat completions endpoint
    response = client.chat.completions.create(
        model="gpt-4-turbo-preview",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=2048,
        temperature=0.2,
    )
    return response.choices[0].message.content.strip()


def predict(message: str, history: List[List[str]]) -> str:
    # Process the conversation and get a response
    messages_format = []
    for user_msg, bot_msg in history:
        if user_msg:
            messages_format.append({"role": "user", "content": user_msg})
        if bot_msg:
            messages_format.append({"role": "system", "content": bot_msg})
    messages_format.append({"role": "user", "content": message})

    # Step 1: Get the response from the local model
    response = send_request_to_api(messages_format, "chat_llama_cpp")
    local_model_response = response["choices"][0]["message"]["content"]

    # Step 2: Ask the local model to judge its own answer
    quality_check_prompt = (
        "Based on the quality standards and relevance to the question, is the following "
        "response of good quality or should we consult a better model? Please reply with "
        "'good quality' or 'bad quality'. Don't reply with anything else except "
        "'good quality' or 'bad quality'"
    )
    quality_check_response = send_request_to_api(
        [
            {
                "role": "user",
                "content": quality_check_prompt
                + "\n\nHere is the question:\n\n"
                + message
                + "\n\nHere is the content: \n\n"
                + local_model_response,
            },
        ],
        "chat_llama_cpp",
        max_tokens=3,
    )
    quality_assessment = quality_check_response["choices"][0]["message"]["content"]
    print(f"Quality assessment response: {quality_assessment}")

    # Step 3: Decide based on quality
    if "good quality" in quality_assessment.lower():
        return local_model_response
    else:
        # If the local model's response is not of good quality, query the OpenAI API
        openai_response = query_openai_api(prompt=message)
        return (
            "# OpenAI response:\n\n"
            + openai_response
            + "\n\n# Local model response:\n\n"
            + local_model_response
        )


chat_interface = gr.ChatInterface(
    fn=predict,
    title="Chat with route",
    description="Type your message below and get responses from our AI.",
    theme=gr.themes.Monochrome(),
)

if __name__ == "__main__":
    chat_interface.launch()
```
The model is a better judge of output quality than it is a producer of quality output.
Quality assessment response: good quality.
Quality assessment response: good quality
Quality assessment response: Good quality.
Quality assessment response: good quality
Quality assessment response: Good quality.
Quality assessment response: Good quality.
Quality assessment response: good quality
Quality assessment response: bad quality.
Quality assessment response: Bad quality.
Now that we have models with much longer contexts, how can we make them slog harder?
Well, we could ask them to do bigger stuff, but their output length constrains them. We could do what we humans do to solve bigger problems - break them into smaller ones, and solve each small problem individually.
This time let's create a file called chat_chain.py:
```python
import gradio as gr
import requests
from typing import List, Dict
import re


def extract_lists(text: str) -> list:
    return [m.strip().split("\n") for m in re.findall(r"((?:^- .+\n?)+|(?:^\d+\. .+\n?)+)", text, re.MULTILINE)]


def send_request_to_api(messages: List[Dict[str, str]], endpoint: str, max_tokens=2048) -> Dict:
    url = f"http://localhost:3000/api/v1/{endpoint}"
    headers = {"Content-Type": "application/json"}
    auth = ("user", "password")
    data = {
        "messages": messages,
        "temperature": 0.2,
        "top_p": 0.95,
        "top_k": 40,
        "max_tokens": max_tokens,
    }
    response = requests.post(url, auth=auth, headers=headers, json=data)
    if response.status_code == 200:
        return response.json()
    else:
        raise Exception("Error communicating with the local API.")


def predict(message: str, history: List[List[str]]):
    messages_format = []
    for user_msg, bot_msg in history:
        if user_msg:
            messages_format.append({"role": "user", "content": user_msg})
        if bot_msg:
            messages_format.append({"role": "system", "content": bot_msg})

    plan_prompt = f"""Let's think step by step to answer the question:

{message}

Generate a very high level plan in the form of a list in markdown surrounded by code blocks.
If the task is simple, it is okay to generate a single point plan.
Ensure each item in the plan is independent of each other so they can be instructed to an LLM one at a time without needing additional context."""

    messages_format.append({"role": "user", "content": plan_prompt})

    # Step 1: Get the response from the local model
    response = send_request_to_api(messages_format, "chat_llama_cpp")
    plan = response["choices"][0]["message"]["content"]
    print(f"Got the plan: {plan[:30]}")

    lists = extract_lists(plan)
    if len(lists) == 1:
        lists = lists[0]

    step_solutions = []  # type: ignore
    for ls in lists:
        print(f"Asking for solution to {ls}")
        messages_format = []
        for user_msg, bot_msg in history:
            if user_msg:
                messages_format.append({"role": "user", "content": user_msg})
            if bot_msg:
                messages_format.append({"role": "system", "content": bot_msg})
        messages_format.append({"role": "user", "content": message})
        messages_format.append(
            {
                "role": "user",
                "content": ("Next lets do this only and nothing else:" + ls if type(ls) is str else "\n".join(ls)),
            }
        )
        response = send_request_to_api(messages_format, "chat_llama_cpp")
        _resp = response["choices"][0]["message"]["content"]
        step_solutions.append((_resp, ls))

    solutions = "\n\n# Next\n---\n\n".join([x[0] for x in step_solutions])

    return f"""# Plan
---
{plan}
# Solutions
---
{solutions}"""


chat_interface = gr.ChatInterface(
    fn=predict,
    title="Chat with chain-of-thought waifu",
    description="Type your message below and get responses from our AI.",
    theme=gr.themes.Monochrome(),
)

if __name__ == "__main__":
    chat_interface.launch()
```
Run it with:

```bash
python ./chat_chain.py
```
Now a small query like create plan for angry birds will result in a high level plan, followed by plans for implementing each item from the high level plan.
As we can see from the logs:
Asking for solution to ['2. Design the game environment: create a 2D plane with various structures and obstacles for the pigs to inhabit and for the birds to interact with.']
Asking for solution to ['3. Develop the Angry Birds: create different types of birds with unique abilities such as normal bird for basic damage, red bird for explosive damage, blue bird for splitting into three upon impact, and yellow bird for creating stars that destroy multiple pigs or structures.']
Asking for solution to ['4. Implement physics engine: use a physics engine to simulate the behavior of the birds and structures when launched and collide with each other.']
Asking for solution to ['5. Create the user interface (UI): design an intuitive UI for players to interact with, including a slingshot for launching birds, a display for showing the current level and progress, and a menu for accessing different levels and game settings.']
Asking for solution to ['6. Develop the game logic: write the rules for how the game progresses, including scoring, level completion, and game over conditions.']
Asking for solution to ['7. Implement sound effects and background music: add appropriate sounds for various game events such as bird launching, pig destruction, and level completion.']
Asking for solution to ['8. Test and debug the game: thoroughly test the game for any bugs or inconsistencies and make necessary adjustments.']
Asking for solution to ['9. Optimize game performance: optimize the game for smooth gameplay and minimal lag, especially on older devices or slower networks.']
Asking for solution to ['10. Release and market the game: release the game on various mobile platforms and promote it through social media, app stores, and other channels to attract players and build a community.']
The script gets a plan consisting of independent steps, then asks the LLM to implement each step individually.
A large number of variations exist of this method, and many of them use GPT-4 to surpass its usual capabilities.
Language modeling is the task that any foundational model is trained on, and later fine-tuned from for other tasks like chat. Language models are mostly useful for one-shot tasks or tasks that need a certain degree of control, e.g. forcing zero-shot classification by asking the model to output only one token. We'll dive into hosting a language model and interacting with the API using curl and python-requests.
For handling VLLMs with Geniusrise, adjust the args to accommodate specific requirements, such as enabling eager loading or managing memory more efficiently:
```python
import requests

response = requests.post(
    "http://localhost:3000/api/v1/complete",
    json={"prompt": "Here is your prompt.", "max_new_tokens": 1024, "do_sample": True},
    auth=("user", "password"),
)
print(response.json())
```
The new generation of AI models has brought in a new era of capabilities that we are still in the teething stages of exploring.
This statement has been ringing in my head for a few months, something that now brings about 53 million results on Google - "AI is eating software". Cool, let's explore a few ideas for how we could help that consumption 😋
Whenever I try to pick up a new programming language, I always first try to learn that minimal subset of it - essentially conditional statements, functions, loops, and the type system if the language has one. Everything else is needed eventually, but one can technically go miles with just this set of constructs.
So let's see how AI mixes with these!
We are mostly looking at runtime ideas. At code-time there are already quite a few more sophisticated tools. But we all saw Groq - inference is getting FAST and dirt cheap, so runtime AI usage will be a reality before long.
Now that we have models that fare well in zero-shot use cases, we can construct a zero_shot_if; let's call it `when`.
We will make use of Natural Language Inference models (NLI for short). These models are trained to figure out whether a hypothesis is entailed by a premise, or whether the two are neutral or contradictory.
So,
"the reflection of the sky on a clear lake was blue in color" entails from "the sky is blue"
"Pythons are very large" is neutral in context of the premise "the sky is blue"
whereas
"The sky is red" is a contradiction in context of the premise "the sky is blue".
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/entailment \
    -H "Content-Type: application/json" \
    -u "user:password" \
    -d '{
        "premise": "This a very good entry level smartphone, battery last 2-3 days after fully charged when connected to the internet. No memory lag issue when playing simple hidden object games. Performance is beyond my expectation, i bought it with a good bargain, couldnt ask for more!",
        "hypothesis": "the phone has an awesome battery life"
    }' | jq
```
Now let's build the language construct that acts as a soft-if condition and internally calls this API:
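One minimal sketch of such a `when` for text: post the premise and hypothesis to the entailment endpoint and check whether "entailment" wins. Note that the response shape used here (a `label_scores` dict of label-to-score) is an assumption for illustration, not the documented response of the API:

```python
import requests


def when(text: str, hypothesis: str) -> bool:
    # Ask the local NLI endpoint whether `hypothesis` entails from `text`.
    # NOTE: the "label_scores" key below is an assumed response shape.
    response = requests.post(
        "http://localhost:3000/api/v1/entailment",
        json={"premise": text, "hypothesis": hypothesis},
        auth=("user", "password"),
    )
    scores = response.json()["label_scores"]
    # Treat the condition as true when "entailment" is the top label.
    return max(scores, key=scores.get) == "entailment"
```

Adjust the response parsing to whatever your deployed entailment bolt actually returns.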
text="This a very good entry level smartphone, battery last 2-3 days after fully charged when connected to the internet. No memory lag issue when playing simple hidden object games. Performance is beyond my expectation, i bought it with a good bargain, couldnt ask for more!"ifwhen(text,"has awesome battery life"):print("The phone has an awesome battery life!")else:print("The phone's battery life is bad.")# The phone has an awesome battery life!
Or sentiment-wise processing:
text="This a very good entry level smartphone, battery last 2-3 days after fully charged when connected to the internet. No memory lag issue when playing simple hidden object games. Performance is beyond my expectation, i bought it with a good bargain, couldnt ask for more!"ifwhen(text,"the user is happy"):print("Yay!")else:print("Yay! please fill up this surveymonkey form")# Yay!
Or content filtering:
text="Goddammit these guys are total idiots. Why have they even made a graphics card so expensive one has to sell their kidneys just to experience raytracing on 8K? I mean come on. We also remember how you treated us Linux users with your shitty drivers crashing Xorg for years before cloud became your fucking cash cow."ifwhen(text,"has profanity in it"):print("Reject the review and send the user a stern email!")else:print("The review is cleared for publication.")# Reject the review and send the user a stern email!
Or policy conformance:
text="Fuck you nvidia - linus torvalds."ifwhen(text,"is anti-nvidia"):print("Hey! Jensen Huang is God!")else:print("Yes, all praise the Lord. And gaben.")# Hey! Jensen Huang is God!text="dude nvidia rocks"ifwhen(text,"is anti-nvidia"):print("Hey! Jensen Huang is God!")else:print("Yes, all praise the Lord. And gaben.")# Yes, all praise the Lord. And gaben.
A lot of useful stuff could be done with this. Consider this pattern:
Given document A, find whether premise P is true. This task, to my knowledge, is a significant part of a large number of processes:
Check if the document meets certain complex standards
- whether the bank statement is recent enough
- whether the document uploaded is correct
- whether the addresses or other text approximately match
- robotic process automation in finance
- content moderation
- content quality checks
- asking any binary question to a document about itself
The actual use cases of NLI are much broader than this list.
```python
import requests


def when(image_base64, hypothesis):
    question = (
        "<image>\nUSER: If the below statement is true in relation to the image. \nReturn nothing except one word - 'true' or 'false':\n\n"
        + hypothesis
        + "\nASSISTANT:"
    )
    data = {
        "image_base64": image_base64,
        "question": question,
        "do_sample": False,
        "max_new_tokens": 1,
    }
    response = requests.post(
        "http://localhost:3000/api/v1/answer_question",
        json=data,
        auth=("user", "password"),
    )
    data = response.json()
    return "true" in data["answer"].replace(question, "").strip().lower()
```
and can be used as:
```python
import base64

with open("image.jpg", "rb") as image_file:
    image_base64 = base64.b64encode(image_file.read()).decode("utf-8")

if when(image_base64, "contains a cat on a sofa"):
    print("Heyyy kitty!")
else:
    print("Get on the couch meow!")

# Heyyy kitty!
```
Handle specific exceptions no more! Let AI figure it out?
How about taking this a notch further? Figure out whether an exception needs to be reported to the tech team? Also, if possible, suggest a fix when it is a code problem; when it is a data problem, call it out.
Or even in cases of extreme requirements - let the AI figure out how to fix the code and keep trying until the new code works? (sounds cool but things can go very wrong here).
There could be a lot of benefits to doing this, especially for critical applications. E.g. preventing spark jobs from crashing on a few bad data points, or servicing very large transactions which cannot go wrong, etc.
Let's first deploy an LLM. Let's use a good code-generation model:
````python
import traceback
from typing import List, Dict
import re
import requests
import sys
import inspect


# We use this to call the api
def send_request_to_api(messages: List[Dict[str, str]]) -> str:
    url = "http://localhost:3000/api/v1/chat_llama_cpp"
    headers = {"Content-Type": "application/json"}
    auth = ("user", "password")
    data = {"messages": messages, "temperature": 0.2, "top_p": 0.95, "top_k": 40, "max_tokens": 2048}
    response = requests.post(url, auth=auth, headers=headers, json=data)
    response_data = response.json()
    if response.status_code == 200:
        last_message = response_data["choices"][0]["message"]["content"]
        return last_message
    else:
        raise Exception("nooooooooooooooooooo!!")


#################################################
# Prompts for doing different functions
#################################################


def analyze_exception(error_message: str, traceback_info: List[str]) -> bool:
    prompt = f"Error message: {error_message}\n\nTraceback:\n{''.join(traceback_info)}\n\nAnalyze if this problem is fixable or is it a transient data problem.\n\nIf it is fixable reply yes else reply no.\n\nPlease answer in only yes or no and nothing else."
    messages = [
        {"role": "system", "content": "You are an AI assistant."},
        {"role": "user", "content": prompt},
    ]
    response = send_request_to_api(messages)
    return "yes" in response.lower()


def generate_fix_suggestion(error_message: str, traceback_info: List[str]) -> str:
    # Get the name of the function that raised the exception
    function_name = traceback_info[-1].split(",")[0].strip().split(" ")[1]

    # Get the parameter names and values of the function that raised the exception
    param_info = ""
    if function_name:
        try:
            frame = inspect.currentframe()
            while frame:
                if frame.f_code.co_name == function_name:
                    break
                frame = frame.f_back
            if frame:
                func = frame.f_globals.get(function_name)
                if func:
                    params = inspect.signature(func).parameters
                    param_values = {name: frame.f_locals.get(name) for name in params}
                    param_info = "\n".join(f"{name}: {value}" for name, value in param_values.items())
        except (KeyError, ValueError):
            pass

    prompt = f"Error message: {error_message}\n\nTraceback:\n{''.join(traceback_info)}\n\nFunction: {function_name}\nParameters:\n{param_info}\n\nSuggest a fix for this code problem."
    messages = [
        {"role": "system", "content": "You are an AI assistant."},
        {"role": "user", "content": prompt},
    ]
    response = send_request_to_api(messages)

    # Extract Python code blocks from the response
    code_blocks = re.findall(r"```python\n(.*?)```", response, re.DOTALL)
    if code_blocks:
        # Return the first code block found
        return code_blocks[0].strip()
    else:
        # Return the entire response if no code blocks are found
        return response


def evaluate_reporting_criteria(error_message: str, traceback_info: List[str]) -> bool:
    prompt = f"Error message: {error_message}\n\nTraceback:\n{''.join(traceback_info)}\n\nIs this exception critical enough to be reported to the tech team or is this a one-off issue? (Yes/No)\n\nPlease answer in only yes or no and nothing else."
    messages = [
        {"role": "system", "content": "You are an AI assistant."},
        {"role": "user", "content": prompt},
    ]
    response = send_request_to_api(messages)
    return "yes" in response.lower()


# The final utility
def figure_it_out(e: Exception):
    # Get the exception details
    exc_type, exc_value, exc_traceback = sys.exc_info()
    raise_exception = False

    # Extract the exception message and traceback
    error_message = str(exc_value)
    traceback_info = traceback.format_tb(exc_traceback)
    frame_info = inspect.getframeinfo(exc_traceback.tb_frame)

    # Analyze the exception using AI
    is_fixable = analyze_exception(error_message, traceback_info)

    if is_fixable:
        # Attempt to suggest a fix or provide insights
        suggested_fix = generate_fix_suggestion(str(exc_value), traceback_info)
        print(f"Suggested fix: {suggested_fix}")
        return suggested_fix
    else:
        raise_exception = True
        print("Not fixable, die like in php")

    # Determine if the exception needs to be reported to the tech team
    should_report = evaluate_reporting_criteria(error_message, traceback_info)
    if should_report:
        print("🚨🚨🚨🚨🚨🚨🚨🚨🚨")
        # Placeholder for reporting logic
        # report_exception(error_message, traceback_info)

    if raise_exception:
        raise e
````
This can be used as such:
```python
def try_ai_exception_handler(x):
    try:
        lol = x / 0
        return x / 3
    except Exception as e:
        solution = figure_it_out(e)
        if solution:
            exec(solution)


y = try_ai_exception_handler(10)
print(y)

# Not fixable, die like in php
# 🚨🚨🚨🚨🚨🚨🚨🚨🚨
# ZeroDivisionError: division by zero
```
```python
def try_ai_exception_handler(x):
    try:
        lol = open("file.txt")
    except Exception as e:
        solution = figure_it_out(e)
        if solution:
            exec(solution)


y = try_ai_exception_handler(10)
print(y)

# Suggested fix:
# try:
#     with open('/path/to/file.txt', 'r') as lol:
#         print(lol.read())
# except FileNotFoundError:
#     print("The file does not exist.")
# The file does not exist.
# None
```
Well, the self-correction part may need stronger LLMs. But the notification logic, or small, simple but semi-open-ended operations, might be possible on-machine before sending things off to Sentry etc.
Functions that, when they fail, can rewrite themselves until they pass, or can adapt themselves to the data so they never fail.
````python
import traceback
from typing import List, Dict
import re
import requests
import sys
import inspect


# We use this to call the api
def send_request_to_api(messages: List[Dict[str, str]]) -> str:
    url = "http://localhost:3000/api/v1/chat_llama_cpp"
    headers = {"Content-Type": "application/json"}
    auth = ("user", "password")
    data = {"messages": messages, "temperature": 0.2, "top_p": 0.95, "top_k": 40, "max_tokens": 2048}
    response = requests.post(url, auth=auth, headers=headers, json=data)
    response_data = response.json()
    if response.status_code == 200:
        last_message = response_data["choices"][0]["message"]["content"]
        return last_message
    else:
        raise Exception("nooooooooooooooooooo!!")


def generate_fix_suggestion(function: str, traceback_info: List[str]) -> str:
    # Get the name of the function that raised the exception
    function_name = traceback_info[-1].split(",")[0].strip().split(" ")[1]

    # Get the parameter names and values of the function that raised the exception
    param_info = ""
    if function_name:
        try:
            frame = inspect.currentframe()
            while frame:
                if frame.f_code.co_name == function_name:
                    break
                frame = frame.f_back
            if frame:
                func = frame.f_globals.get(function_name)
                if func:
                    params = inspect.signature(func).parameters
                    param_values = {name: frame.f_locals.get(name) for name in params}
                    param_info = "\n".join(f"{name}: {value}" for name, value in param_values.items())
        except (KeyError, ValueError):
            pass

    prompt = f"Function code: {function}\n\nTraceback:\n{''.join(traceback_info)}\n\nFunction: {function_name}\nParameters:\n{param_info}\n\nRewrite the complete function with a fix."
    messages = [
        {"role": "system", "content": "You are an AI assistant that fixes code."},
        {"role": "user", "content": prompt},
    ]
    response = send_request_to_api(messages)

    # Extract Python code blocks from the response
    code_blocks = re.findall(r"```python\n(.*?)```", response, re.DOTALL)
    if code_blocks:
        # Return the first code block found
        return code_blocks[0].strip()
    else:
        # Return the entire response if no code blocks are found
        return response


def adapt_to_data(fn, *args, **kwargs):
    try:
        return fn(*args, **kwargs)
    except Exception as e:
        exc_type, exc_value, exc_traceback = sys.exc_info()
        traceback_info = traceback.format_tb(exc_traceback)
        solution = generate_fix_suggestion(inspect.getsource(fn), traceback_info)
        print("------------------------------------------------")
        print(solution)
        print("------------------------------------------------")
        if solution:
            try:
                exec(solution)
            except:
                raise e
        else:
            raise e
````
Example: functions that expect certain fields without checking whether they exist first:
data={"field1":3,"field2":["lol"]}defhandler(data):field1=data["field1"]field2=data["field2"]field3=data["field3"]returnfield2adapt_to_data(handler,data)# ------------------------------------------------# def handler(data):# if "field1" in data:# field1 = data["field1"]# else:# field1 = None# if "field2" in data:# field2 = data["field2"]# else:# field2 = None# if "field3" in data:# field3 = data["field3"]# else:# field3 = None# return field2# ------------------------------------------------
A lot of integration code is going to be written by AI. Now, the core of any integration is a mapping, and no matter how much automation you do, one finally ends up with this core problem: each API-to-API mapping has to be done by hand. And then you hire integration engineers, etc. This kind of thing happens to be an entire enterprise play, and different shitty ERPs talking to each other via shitty code is a huge mess. Let's make it worse with AI? 😊
"soft"-composing functions
Say function f1 has output signature o1 and function f2 has input signature i2. If i2 is a subset of o1, then we should be able to do f2(f1(x)) through this method. However, as we will see, if i2 can be derived from o1, that should also work.
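The subset condition above can be sketched as a tiny compatibility check over schema dicts (the schemas here are the same toy shapes used later in this section):

```python
def composable(o1: dict, i2: dict) -> bool:
    # Toy check of the condition above: f2 can be soft-composed after f1
    # when every field in f2's input signature (i2) also appears in f1's
    # output signature (o1) with the same type.
    return all(k in o1 and o1[k] == t for k, t in i2.items())


o1 = {"first name": "string", "last name": "string", "age": "integer"}
i2 = {"first name": "string", "age": "integer"}
print(composable(o1, i2))  # → True
```

The "derived from" case is where the LLM-generated mapping function comes in: it covers fields like "full name" that are not in o1 verbatim but can be computed from it.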
Implementation
Cool, so given two API schemas:
Generate a mapping
Test each mapping against mock inputs
Generate 10 total working mappings
Ask the human for feedback
The preferred mapping is deployed as config to an abstract integration core system
````python
import traceback
from typing import List, Dict
import re
import requests
import sys
import inspect


# We use this to call the api
def send_request_to_api(messages: List[Dict[str, str]]) -> str:
    url = "http://localhost:3000/api/v1/chat_llama_cpp"
    headers = {"Content-Type": "application/json"}
    auth = ("user", "password")
    # note: greater temperature to get diverse outputs
    data = {"messages": messages, "temperature": 0.7, "top_p": 0.95, "top_k": 40, "max_tokens": 2048}
    response = requests.post(url, auth=auth, headers=headers, json=data)
    response_data = response.json()
    if response.status_code == 200:
        last_message = response_data["choices"][0]["message"]["content"]
        return last_message
    else:
        raise Exception("nooooooooooooooooooo!!")


def generate_mapping_function(input_example, output_example):
    # Prepare the prompt for the LLM
    prompt = f"""Given the input schema:

{input_example}

And the desired output schema:

{output_example}

Generate a Python function that takes an input matching the input schema and returns an output matching the output schema. The function should be named 'mapping_function'."""

    # Call the LLM API to generate the mapping function
    messages = [
        {"role": "system", "content": "You are an AI assistant that generates Python code."},
        {"role": "user", "content": prompt},
    ]
    response = send_request_to_api(messages)

    # Extract the generated function code from the response
    function_code = re.findall(r"```python\n(.*?)```", response, re.DOTALL)

    if function_code:
        # Execute the generated function code
        exec(function_code[0], globals())

        # Test the generated mapping function with the example input
        try:
            mapped_output = mapping_function(input_example)
            print("Generated mapping function:")
            print(function_code[0])
            print("\nTesting the mapping function:")
            print(f"Input: {input_example}")
            print(f"Output: {mapped_output}")

            # Validate the mapped output against the desired output schema
            if list(mapped_output.keys()) == list(output_example.keys()):
                print("Mapping function generated successfully!")
                return function_code[0]
            else:
                print("Mapped output does not match the desired output schema.")
                raise Exception("Function generated is incorrect")
        except Exception as e:
            print(f"Error occurred while testing the mapping function: {e}")
            raise e
    else:
        print("No mapping function code found in the LLM response.")
        return None
````
Let's use this:
```python
input_example = {"first name": "string", "last name": "string", "age": "integer"}
output_example = {
    "status": "success",
    "data": {"first name": "string", "last name": "string", "full name": "string", "age": "integer"},
}

generated = []
while len(generated) < 3:
    try:
        fn = generate_mapping_function(input_example, output_example)
        generated.append(fn)
    except:
        pass

print("Please choose which one you think is the best:")
for gen in generated:
    print("------------------------------------------------")
    print(gen)
    print("------------------------------------------------")
```
This is what the output looks like:
```
Please choose which one you think is the best:
------------------------------------------------
def mapping_function(input_data):
    if isinstance(input_data, dict) and set(input_data.keys()) == {'first name', 'last name', 'age'}:
        first_name = input_data['first name']
        last_name = input_data['last name']
        age = input_data['age']
        full_name = first_name + ' ' + last_name
        output_data = {'status': 'success', 'data': {'first name': first_name, 'last name': last_name, 'full name': full_name, 'age': age}}
        return output_data
    else:
        return None
------------------------------------------------
------------------------------------------------
def mapping_function(input_dict):
    if not all(k in input_dict for k in ('first name', 'last name', 'age')):
        return {'status': 'failure', 'error': 'Input dictionary is missing one or more required fields.'}
    first_name = input_dict['first name']
    last_name = input_dict['last name']
    age = input_dict['age']
    full_name = f"{first_name} {last_name}"
    return {'status': 'success', 'data': {'first name': first_name, 'last name': last_name, 'full name': full_name, 'age': age}}
------------------------------------------------
------------------------------------------------
def mapping_function(input_data):
    if not all(k in input_data for k in ('first name', 'last name', 'age')):
        return {'status': 'failure', 'error': 'Missing required fields'}
    full_name = input_data['first name'] + ' ' + input_data['last name']
    output_data = {'status': 'success', 'data': {'first name': input_data['first name'], 'last name': input_data['last name'], 'full name': full_name, 'age': input_data['age']}}
    return output_data
------------------------------------------------
```
This was, of course, a toy example. If the testing is strong enough, and the rest of the stack this plugs into is generic enough, then even the human can be removed from the loop.
Now think of all those multi-vendor API integrations like payment gateways that can now be automated without killing oneself!
Let's create a workspace for local experimentation. We will not build anything here, just use whatever components are already available. This is what a low-code workflow could look like.
Let's create a workflow in which:
A web server listens for all kinds of HTTP events.
Clients send the following information to the server:
HTTP request
Response and response status code
The server buffers events in batches of 1000 and uploads them on to s3.
Train a small language model on the data to predict whether a request was valid.
A representation of the process using a sequence diagram:
This model could be used to predict if a request will fail before serving it. It could also be used to classify requests as malicious etc.
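The buffering step (batches of 1000, then upload) can be sketched independently of the web server. This EventBuffer is hypothetical - the real Spout would upload each flushed batch to S3 rather than hand it to a callback:

```python
class EventBuffer:
    # Toy sketch of the buffering behaviour described above: accumulate
    # events and flush every `batch_size` events to a handler.
    def __init__(self, batch_size=1000, on_flush=print):
        self.batch_size = batch_size
        self.on_flush = on_flush
        self.events = []

    def add(self, event):
        # Buffer the event; hand off a full batch once we hit batch_size.
        self.events.append(event)
        if len(self.events) >= self.batch_size:
            self.on_flush(self.events)
            self.events = []


batches = []
buf = EventBuffer(batch_size=3, on_flush=batches.append)
for i in range(7):
    buf.add({"text": f"GET /api/v1/customer/{i}", "label": "0"})
print(len(batches))  # → 2 (full batches flushed; 1 event still buffered)
```

In production the partial batch also needs a time- or shutdown-based flush so trailing events are not lost.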
```bash
while true; do
    # Generate a random customer ID
    customer_id=$((RANDOM % 10000001))

    # Determine the status code based on the customer ID
    if [ $customer_id -gt 10000000 ]; then
        status_code="1"
    elif [ $customer_id -le 10000 ]; then
        status_code="1"
    else
        status_code="0"
    fi

    # Make the API call
    curl --header "Content-Type: application/json" \
        --request POST \
        --data "{\"text\":\"GET /api/v1/customer/$customer_id\",\"label\":\"$status_code\"}" \
        http://localhost:8080/application-1-tag-a-tag-b-whatever
done
```
Verify that the data is being dumped in the right place with the correct format:
Now let's test the second leg of this: the model. Since we want the model to predict the status code given the data, we will use classification as our fine-tuning task.
Let's use the bert-base-uncased model for now, as it is small enough to run on a CPU on a laptop.
We also create a model on huggingface hub to store the model once it is trained: ixaxaar/geniusrise-api-status-code-prediction.
```bash
genius HuggingFaceClassificationFineTuner rise \
    batch \
        --input_s3_bucket geniusrise-test \
        --input_s3_folder train \
    batch \
        --output_s3_bucket geniusrise-test \
        --output_s3_folder api-prediction \
    none \
    fine_tune \
        --args \
            model_name="bert-base-uncased" \
            tokenizer_name="bert-base-uncased" \
            num_train_epochs=2 \
            per_device_train_batch_size=64 \
            model_class=BertForSequenceClassification \
            tokenizer_class=BertTokenizer \
            data_masked=True \
            data_extractor_lambda="lambda x: x['data']" \
            hf_repo_id=ixaxaar/geniusrise-api-status-code-prediction \
            hf_commit_message="initial local testing" \
            hf_create_pr=True \
            hf_token=hf_lalala
```
🚀 Initialized Task with ID: HuggingFaceClassificationFineTuner772627a0-43a5-4f9d-9b0f-4362d69ba08c
Found credentials in shared credentials file: ~/.aws/credentials
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Loading dataset from /tmp/tmp3h3wav4h/train
New labels detected, ignore if fine-tuning
Map: 100%|██████████████████████████| 300/300 [00:00<00:00, 4875.76 examples/s]
{'train_runtime': 13.3748, 'train_samples_per_second': 44.861, 'train_steps_per_second': 22.43, 'train_loss': 0.6400579833984374, 'epoch': 2.0}
100%|████████████████████████████████████████| 300/300 [00:13<00:00, 22.43it/s]
pytorch_model.bin: 100%|████████████████████| 438M/438M [01:29<00:00, 4.88MB/s]
Successfully executed the bolt method: fine_tune 👍
Or we could simply use the yaml we created in the previous step:
```bash
genius rise up
```
See the status of the deployment:
```bash
# Find the pod id
genius pod show \
    --namespace geniusrise \
    --context_name arn:aws:eks:us-east-1:genius-dev:cluster/geniusrise-dev 2>&1 | grep Running

genius pod describe \
    webhook-75c4bff67d-hbhts \
    --namespace geniusrise \
    --context_name arn:aws:eks:us-east-1:genius-dev:cluster/geniusrise-dev

genius deployment describe \
    webhook \
    --namespace geniusrise \
    --context_name arn:aws:eks:us-east-1:genius-dev:cluster/geniusrise-dev

genius service describe \
    webhook \
    --namespace geniusrise \
    --context_name arn:aws:eks:us-east-1:genius-dev:cluster/geniusrise-dev
```
SNOMED-CT: a knowledge graph of standard medical terminology.
IHTSDO: a standards body for medical terminologies in a number of countries.
UMLS: the Unified Medical Language System, a set of files and software that brings together many health and biomedical vocabularies and standards.
We could choose a large language model and train the model on the NER fine-tuning task. The model would then be able to recognize and tag medical terms in any given text data.
We use an LLM to create a vectorized layer over SNOMED-CT. This layer can be used to semantically search for "seed" nodes in the graph. We can then use these seed nodes to traverse nodes a few hops adjacent to the seed nodes.
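The seed-and-traverse step can be sketched as a plain breadth-first walk over an adjacency list. The mini-graph below is made up for illustration; the real graph would be the SNOMED-CT relationship files:

```python
from collections import deque


def neighborhood(graph: dict, seeds: list, hops: int) -> set:
    # Starting from the "seed" nodes found via semantic search, collect
    # every node within `hops` edges in the knowledge graph.
    seen = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for adj in graph.get(node, []):
            if adj not in seen:
                seen.add(adj)
                frontier.append((adj, depth + 1))
    return seen


# Hypothetical mini-graph of SNOMED-CT-like concepts
graph = {
    "myocardial infarction": ["heart disease", "chest pain"],
    "heart disease": ["cardiovascular disease"],
    "chest pain": ["pain"],
}
print(neighborhood(graph, ["myocardial infarction"], 2))
```

The returned set is what gets passed on for annotating the EHR document and enriching each matched node with its adjacent concepts.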
We use the knowledge graph search results not only to annotate each node seen in the EHR document, but also to add information about those nodes derived from their adjacent nodes. But first, we need to make sure we query the right information instead of simply vectorizing chunks and throwing them at semantic search. We need a "traditional" pipeline for this - lemmatization followed by POS tagging. We use both proper nouns and out-of-vocabulary words as search query terms.
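As a toy stand-in for the lemmatization/POS step, the out-of-vocabulary part can be approximated with a plain vocabulary filter (the vocabulary and sentence here are made up; a real pipeline would use a proper tagger):

```python
def query_terms(text: str, vocabulary: set) -> list:
    # Words outside a known common-word vocabulary become candidate
    # search terms for the vectorized SNOMED-CT layer.
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    return sorted({w for w in words if w and w not in vocabulary})


vocabulary = {"the", "patient", "was", "admitted", "with", "acute", "and", "elevated"}
print(query_terms("The patient was admitted with acute myocarditis and elevated troponin", vocabulary))
# → ['myocarditis', 'troponin']
```

The surviving terms ("myocarditis", "troponin") are exactly the kind of domain words you would want to fire at the semantic search layer to find seed nodes.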
```bash
#!/bin/bash

# Prompt for project details
read -p "Enter your project name: " project_name
read -p "Enter your name: " author_name
read -p "Enter your email: " author_email
read -p "Enter your GitHub username: " github_username
read -p "Enter a brief description of your project: " project_description

# Create project structure
mkdir $project_name
cd $project_name
mkdir $project_name tests

# Create basic files
touch README.md
touch requirements.txt
touch setup.py
touch Makefile
touch $project_name/__init__.py
touch tests/__init__.py

# Populate README.md
echo "# $project_name" > README.md
echo "\n$project_description" >> README.md

# Populate setup.py
cat << EOL > setup.py
from setuptools import setup, find_packages

with open("README.md", "r", encoding="utf-8") as fh:
    long_description = fh.read()

setup(
    name='$project_name',
    version='0.1.0',
    packages=find_packages(exclude=["tests", "tests.*"]),
    install_requires=[],
    python_requires='>=3.10',
    author='$author_name',
    author_email='$author_email',
    description='$project_description',
    long_description=long_description,
    long_description_content_type='text/markdown',
    url='https://github.com/$github_username/$project_name',
    classifiers=[
        'Programming Language :: Python :: 3',
        'License :: OSI Approved :: MIT License',
        'Operating System :: OS Independent',
    ],
)
EOL

# Populate Makefile
cat << EOL > Makefile
setup:
	@pip install -r ./requirements.txt

test:
	@coverage run -m pytest -v ./tests

publish:
	@python setup.py sdist bdist_wheel
	@twine upload dist/$project_name-\$${VERSION}-* --verbose
EOL

# Set up the virtual environment and install necessary packages
virtualenv venv -p `which python3.10`
source venv/bin/activate
pip install twine setuptools pytest coverage geniusrise
pip freeze > requirements.txt

# Fetch .pre-commit-config.yaml and .gitignore from geniusrise/geniusrise
curl -O https://raw.githubusercontent.com/geniusrise/geniusrise/master/.pre-commit-config.yaml
curl -O https://raw.githubusercontent.com/geniusrise/geniusrise/master/.gitignore

echo "Project $project_name initialized!"
```
Create a install script out of this and execute it:
Let's prepare the knowledge graph by vectorizing each node's knowledge into a flat, vectorized memory. This is a periodic activity that needs to be repeated whenever a new version of SNOMED-CT is released (typically biannually).
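The vectorization step can be sketched as below. This is a minimal, dependency-free illustration: `embed` is a toy hashing embedding standing in for a real sentence-embedding model, and the SNOMED-CT concept IDs and descriptions are made up for the example.

```python
import hashlib
import math

def embed(text, dim=64):
    """Toy deterministic bag-of-words embedding; a real pipeline would use a
    sentence-embedding model here."""
    vec = [0.0] * dim
    for token in text.lower().split():
        h = int(hashlib.md5(token.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def build_flat_memory(nodes):
    """nodes: {concept_id: description}. Returns parallel lists of ids and
    unit vectors -- the 'flat memory' rebuilt on each SNOMED-CT release."""
    ids, matrix = [], []
    for cid, desc in nodes.items():
        ids.append(cid)
        matrix.append(embed(desc))
    return ids, matrix

def search(ids, matrix, query, k=1):
    """Cosine similarity over the flat memory (vectors are unit-normalized)."""
    q = embed(query)
    scored = sorted(
        ((sum(a * b for a, b in zip(q, row)), cid) for cid, row in zip(ids, matrix)),
        reverse=True,
    )
    return [cid for _, cid in scored[:k]]

# Hypothetical concept IDs and descriptions, for illustration only
nodes = {
    "22298006": "myocardial infarction heart attack",
    "38341003": "hypertensive disorder high blood pressure",
}
ids, matrix = build_flat_memory(nodes)
print(search(ids, matrix, "heart attack"))
```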
3. Uploading to ACR with Multiple Local Directories
In this example, we upload a Docker image to Azure Container Registry (ACR) and specify multiple local directories to be copied into the Docker container.
# First, create a Dockerfile that copies multiple directories
# Then use the following command
genius docker package multi_dir_app acr \
  --auth '{"acr_username": "username", "acr_password": "password", "acr_login_server": "login_server"}' \
  --local_dir "./app ./config"
4. Uploading to GCR with Custom Base Image, Packages, and OS Packages
This example demonstrates how to upload a Docker image to Google Container Registry (GCR) with a custom base image, Python packages, and OS packages.
This example shows how to upload a Docker image to Quay with all available customizations like base image, working directory, local directory, Python packages, OS packages, and environment variables.
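These packaging options map naturally onto Dockerfile instructions. The sketch below is hypothetical (the actual `genius docker package` implementation may generate something different); `render_dockerfile` and its arguments mirror the CLI flags purely for illustration.

```python
def render_dockerfile(base_image, workdir, local_dirs, packages, os_packages, env_vars):
    """Hypothetical sketch of the Dockerfile that options like --base_image,
    --local_dir, --packages, --os_packages, and --env_vars could translate into."""
    lines = [f"FROM {base_image}", f"WORKDIR {workdir}"]
    if os_packages:
        # OS packages first, so later layers can rely on them
        lines.append("RUN apt-get update && apt-get install -y " + " ".join(os_packages))
    for d in local_dirs:
        lines.append(f"COPY {d} {workdir}/")
    if packages:
        lines.append("RUN pip install " + " ".join(packages))
    for key, value in env_vars.items():
        lines.append(f"ENV {key}={value}")
    return "\n".join(lines)

print(render_dockerfile(
    base_image="python:3.10-slim",
    workdir="/app",
    local_dirs=["./app", "./config"],
    packages=["geniusrise"],
    os_packages=["libpq-dev"],
    env_vars={"LOG_LEVEL": "info"},
))
```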
This guide provides comprehensive instructions on how to deploy and manage resources in a Kubernetes cluster using the Geniusrise platform. The guide covers the following functionalities:
To create a new cron job, you can use the create_cronjob sub-command. You'll need to specify the name, Docker image, command to run, and the cron schedule.
In this example, the --config=my_config.yaml would be used to read the common arguments from the YAML file, and the rest of the arguments would be taken from the command line.
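One common way to implement this precedence (the config file supplies defaults, explicit command-line flags win) is via argparse's `set_defaults`. This is a generic sketch of the technique, not Geniusrise's actual implementation; the tiny flat-YAML parser stands in for a real YAML library.

```python
import argparse

def load_simple_yaml(text):
    """Minimal parser for flat 'key: value' YAML, enough for this sketch;
    a real implementation would use PyYAML."""
    config = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if ":" in line:
            key, _, value = line.partition(":")
            config[key.strip()] = value.strip()
    return config

def parse_args(argv, config_text=""):
    parser = argparse.ArgumentParser()
    parser.add_argument("--namespace")
    parser.add_argument("--replicas", type=int)
    # File values become defaults; flags given on the command line still win.
    parser.set_defaults(**load_simple_yaml(config_text))
    return parser.parse_args(argv)

args = parse_args(["--replicas", "3"], "namespace: geniusrise\nreplicas: 1\n")
print(args.namespace, args.replicas)
```

Here `namespace` comes from the file while `--replicas 3` on the command line overrides the file's value.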
🚀 The Job class is responsible for managing Kubernetes Jobs. It extends the Deployment class
and provides additional functionalities specific to Kubernetes Jobs.
🚀 The CronJob class is responsible for managing Kubernetes CronJobs. It extends the Job class
and provides additional functionalities specific to Kubernetes CronJobs.
AirflowRunner is a utility for managing and orchestrating Airflow DAGs. It is designed
to provide a command-line interface (CLI) for creating, describing, showing, deleting,
and getting the status of Airflow DAGs.
This class uses the Airflow models to interact with DAGs and the DockerOperator to
run tasks in Docker containers. It aims to simplify the deployment and management
of Airflow tasks, providing a straightforward way to deploy DAGs with Docker tasks
from the command line.
CLI Usage
genius airflow sub-command
Sub-commands
create: Create a new DAG with the given parameters and Docker task.
genius airflow create [options]
describe: Describe a specific DAG by its ID.
genius airflow describe --dag_id example_dag
show: Show all available DAGs in the Airflow environment.
genius airflow show
delete: Delete a specific DAG by its ID.
genius airflow delete --dag_id example_dag
status: Get the status of a specific DAG by its ID.
genius airflow status --dag_id example_dag --airflow_api_base_url http://localhost:8080/api/v1
Each sub-command supports various options to specify the details of the DAG or the
Docker task, such as the schedule interval, start date, owner, image, command, and
more.
DockerResourceManager is a utility for managing Docker resources, including containers and images. It provides a command-line interface (CLI) for various Docker operations, such as listing, inspecting, creating, starting, and stopping containers, as well as managing images.
This class uses the Docker SDK for Python to interact with the Docker daemon, offering a convenient way to manage Docker containers and images from the command line.
CLI Usage
genius docker sub-command
Sub-commands
list_containers: List all containers, with an option to include stopped containers.
genius docker list_containers [--all]
inspect_container: Inspect a specific container by its ID.
genius docker inspect_container <container_id>
create_container: Create a new container with specified image, command, and other parameters.
genius docker create_container <image> [options]
start_container: Start a container by its ID.
genius docker start_container <container_id>
stop_container: Stop a container by its ID.
genius docker stop_container <container_id>
list_images: List all Docker images available on the local system.
genius docker list_images
inspect_image: Inspect a specific image by its ID.
genius docker inspect_image <image_id>
pull_image: Pull an image from a Docker registry.
genius docker pull_image <image>
push_image: Push an image to a Docker registry.
genius docker push_image <image>
Each sub-command supports various options to specify the details of the container or image operation, such as environment variables, port mappings, volume mappings, and more.
Attributes:
client: The Docker client connection to interact with the Docker daemon.
log: Logger for the class to log information, warnings, and errors.
console: Rich console object to print formatted and styled outputs.
Methods
connect: Method to establish a connection to the Docker daemon.
list_containers: Method to list all containers, with an option to include stopped ones.
inspect_container: Method to inspect details of a specific container.
create_container: Method to create a new container with given parameters.
start_container: Method to start a specific container.
stop_container: Method to stop a specific container.
list_images: Method to list all Docker images.
inspect_image: Method to inspect a specific image.
pull_image: Method to pull an image from a Docker registry.
push_image: Method to push an image to a Docker registry.
Note
Ensure that the Docker daemon is running and accessible at the specified URL.
Make sure to have the necessary permissions to interact with the Docker daemon and manage containers and images.
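For reference, here is a minimal sketch of the kind of Docker SDK for Python calls such a manager wraps. It assumes `pip install docker` and a running daemon for the `list_containers_via_sdk` part; `summarize_containers` is an illustrative helper, not part of the Geniusrise API.

```python
def summarize_containers(containers):
    """Turn Docker SDK container objects (anything exposing .short_id, .name,
    and .status) into plain rows suitable for display."""
    return [(c.short_id, c.name, c.status) for c in containers]

def list_containers_via_sdk():
    """Sketch of the SDK calls wrapped by a manager like this one.
    Requires a running Docker daemon and the 'docker' package."""
    import docker

    client = docker.from_env()  # connect to the local Docker daemon
    return summarize_containers(client.containers.list(all=True))

# Demonstrate the helper with stand-in objects (no daemon needed)
class FakeContainer:
    def __init__(self, short_id, name, status):
        self.short_id, self.name, self.status = short_id, name, status

for row in summarize_containers([FakeContainer("ab12cd34", "web", "running")]):
    print(row)
```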
DockerSwarmManager is a utility for managing Docker Swarm services, including creating, inspecting, updating, and removing services. It extends DockerResourceManager to provide swarm-specific functionalities and commands via a command-line interface (CLI).
The manager interacts with the Docker Swarm API, offering a convenient way to manage Swarm services, nodes, and other swarm-related tasks from the command line.
CLI Usage
genius docker swarm sub-command
Sub-commands
list_nodes: List all nodes in the Docker Swarm.
genius docker swarm list_nodes
inspect_node: Inspect a specific Swarm node by its ID.
genius docker swarm inspect_node <node_id>
create_service: Create a new service in the Docker Swarm with comprehensive specifications.
genius docker swarm create_service [options]
list_services: List all services in the Docker Swarm.
genius docker swarm list_services
inspect_service: Inspect a specific service by its ID.
genius docker swarm inspect_service <service_id>
update_service: Update an existing service with new parameters.
genius docker swarm update_service <service_id> [options]
remove_service: Remove a service from the Docker Swarm.
genius docker swarm remove_service <service_id>
service_logs: Retrieve logs of a Docker Swarm service.
genius docker swarm service_logs <service_id> [--tail] [--follow]
scale_service: Scale a service to a specified number of replicas.
genius docker swarm scale_service <service_id> <replicas>
Each sub-command supports various options to specify the details of the swarm node or service operation. These options include node and service IDs, image and command specifications for services, environment variables, resource limits, and much more.
Attributes:
swarm_client: The Docker Swarm client connection to interact with the Docker Swarm API.
log: Logger for the class to log information, warnings, and errors.
console: Rich console object to print formatted and styled outputs.
Methods
connect_to_swarm: Method to establish a connection to the Docker Swarm.
list_nodes: Method to list all nodes in the Docker Swarm.
inspect_node: Method to inspect details of a specific Swarm node.
create_service: Method to create a new service with given specifications.
list_services: Method to list all services in the Docker Swarm.
inspect_service: Method to inspect a specific service.
update_service: Method to update an existing service with new parameters.
remove_service: Method to remove a service from the Docker Swarm.
get_service_logs: Method to retrieve logs of a Docker Swarm service.
scale_service: Method to scale a service to a specified number of replicas.
Note
Ensure that the Docker Swarm is initialized and running.
Make sure to have the necessary permissions to interact with the Docker Swarm and manage services and nodes.
: Choose the type of output data: batch or streaming.
{none,redis,postgres,dynamodb,prometheus}
: Select the type of state manager: none, redis, postgres, dynamodb, or prometheus.
method_name
: The name of the method to execute on the spout.
Options genius TestSpoutCtlSpout rise
--buffer_size BUFFER_SIZE: Specify the size of the buffer.
--output_folder OUTPUT_FOLDER: Specify the directory where output files should be stored temporarily.
--output_kafka_topic OUTPUT_KAFKA_TOPIC: Kafka output topic for streaming spouts.
--output_kafka_cluster_connection_string OUTPUT_KAFKA_CLUSTER_CONNECTION_STRING: Kafka connection string for streaming spouts.
--output_s3_bucket OUTPUT_S3_BUCKET: Provide the name of the S3 bucket for output storage.
--output_s3_folder OUTPUT_S3_FOLDER: Indicate the S3 folder for output storage.
--redis_host REDIS_HOST: Enter the host address for the Redis server.
--redis_port REDIS_PORT: Enter the port number for the Redis server.
--redis_db REDIS_DB: Specify the Redis database to be used.
--postgres_host POSTGRES_HOST: Enter the host address for the PostgreSQL server.
--postgres_port POSTGRES_PORT: Enter the port number for the PostgreSQL server.
--postgres_user POSTGRES_USER: Provide the username for the PostgreSQL server.
--postgres_password POSTGRES_PASSWORD: Provide the password for the PostgreSQL server.
--postgres_database POSTGRES_DATABASE: Specify the PostgreSQL database to be used.
--postgres_table POSTGRES_TABLE: Specify the PostgreSQL table to be used.
--dynamodb_table_name DYNAMODB_TABLE_NAME: Provide the name of the DynamoDB table.
--dynamodb_region_name DYNAMODB_REGION_NAME: Specify the AWS region for DynamoDB.
--prometheus_gateway PROMETHEUS_GATEWAY: Specify the Prometheus gateway URL.
--args ...: Additional keyword arguments to pass to the spout.
: Choose the type of output data: batch or streaming.
{none,redis,postgres,dynamodb,prometheus}
: Select the type of state manager: none, redis, postgres, dynamodb, or prometheus.
{k8s}
: Choose the type of deployment.
method_name
: The name of the method to execute on the spout.
Options genius TestSpoutCtlSpout deploy
--buffer_size BUFFER_SIZE: Specify the size of the buffer.
--output_folder OUTPUT_FOLDER: Specify the directory where output files should be stored temporarily.
--output_kafka_topic OUTPUT_KAFKA_TOPIC: Kafka output topic for streaming spouts.
--output_kafka_cluster_connection_string OUTPUT_KAFKA_CLUSTER_CONNECTION_STRING: Kafka connection string for streaming spouts.
--output_s3_bucket OUTPUT_S3_BUCKET: Provide the name of the S3 bucket for output storage.
--output_s3_folder OUTPUT_S3_FOLDER: Indicate the S3 folder for output storage.
--redis_host REDIS_HOST: Enter the host address for the Redis server.
--redis_port REDIS_PORT: Enter the port number for the Redis server.
--redis_db REDIS_DB: Specify the Redis database to be used.
--postgres_host POSTGRES_HOST: Enter the host address for the PostgreSQL server.
--postgres_port POSTGRES_PORT: Enter the port number for the PostgreSQL server.
--postgres_user POSTGRES_USER: Provide the username for the PostgreSQL server.
--postgres_password POSTGRES_PASSWORD: Provide the password for the PostgreSQL server.
--postgres_database POSTGRES_DATABASE: Specify the PostgreSQL database to be used.
--postgres_table POSTGRES_TABLE: Specify the PostgreSQL table to be used.
--dynamodb_table_name DYNAMODB_TABLE_NAME: Provide the name of the DynamoDB table.
--dynamodb_region_name DYNAMODB_REGION_NAME: Specify the AWS region for DynamoDB.
--prometheus_gateway PROMETHEUS_GATEWAY: Specify the Prometheus gateway URL.
--k8s_kind {deployment,service,job,cron_job}: Choose the type of Kubernetes resource.
--k8s_name K8S_NAME: Name of the Kubernetes resource.
--k8s_image K8S_IMAGE: Docker image for the Kubernetes resource.
--k8s_replicas K8S_REPLICAS: Number of replicas.
--k8s_env_vars K8S_ENV_VARS: Environment variables as a JSON string.
--k8s_cpu K8S_CPU: CPU requirements.
--k8s_memory K8S_MEMORY: Memory requirements.
--k8s_storage K8S_STORAGE: Storage requirements.
--k8s_gpu K8S_GPU: GPU requirements.
--k8s_kube_config_path K8S_KUBE_CONFIG_PATH: Path to the local kubeconfig file for the cluster.
--k8s_api_key K8S_API_KEY: API key for the Kubernetes cluster.
--k8s_api_host K8S_API_HOST: API host for the Kubernetes cluster.
--k8s_verify_ssl K8S_VERIFY_SSL: Whether to verify SSL certificates.
--k8s_ssl_ca_cert K8S_SSL_CA_CERT: Path to the SSL CA certificate.
--k8s_cluster_name K8S_CLUSTER_NAME: Name of the Kubernetes cluster.
--k8s_context_name K8S_CONTEXT_NAME: Name of the kubeconfig context.
--k8s_namespace K8S_NAMESPACE: Kubernetes namespace.
--k8s_labels K8S_LABELS: Labels for Kubernetes resources, as a JSON string.
--k8s_annotations K8S_ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--k8s_port K8S_PORT: Port to run the spout on as a service.
--k8s_target_port K8S_TARGET_PORT: Port to expose the spout on as a service.
--k8s_schedule K8S_SCHEDULE: Schedule to run the spout on as a cron job.
--args ...: Additional keyword arguments to pass to the spout.
: Choose the type of input data: batch or streaming.
{batch,streaming,stream_to_batch}
: Choose the type of output data: batch or streaming.
{none,redis,postgres,dynamodb,prometheus}
: Select the type of state manager: none, redis, postgres, dynamodb, or prometheus.
method_name
: The name of the method to execute on the bolt.
Options genius TestBoltCtlBolt rise
--buffer_size BUFFER_SIZE: Specify the size of the buffer.
--input_folder INPUT_FOLDER: Specify the directory where input files should be stored temporarily.
--input_kafka_topic INPUT_KAFKA_TOPIC: Kafka input topic for streaming bolts.
--input_kafka_cluster_connection_string INPUT_KAFKA_CLUSTER_CONNECTION_STRING: Kafka connection string for streaming bolts.
--input_kafka_consumer_group_id INPUT_KAFKA_CONSUMER_GROUP_ID: Kafka consumer group id to use.
--input_s3_bucket INPUT_S3_BUCKET: Provide the name of the S3 bucket for input storage.
--input_s3_folder INPUT_S3_FOLDER: Indicate the S3 folder for input storage.
--output_folder OUTPUT_FOLDER: Specify the directory where output files should be stored temporarily.
--output_kafka_topic OUTPUT_KAFKA_TOPIC: Kafka output topic for streaming bolts.
--output_kafka_cluster_connection_string OUTPUT_KAFKA_CLUSTER_CONNECTION_STRING: Kafka connection string for streaming bolts.
--output_s3_bucket OUTPUT_S3_BUCKET: Provide the name of the S3 bucket for output storage.
--output_s3_folder OUTPUT_S3_FOLDER: Indicate the S3 folder for output storage.
--redis_host REDIS_HOST: Enter the host address for the Redis server.
--redis_port REDIS_PORT: Enter the port number for the Redis server.
--redis_db REDIS_DB: Specify the Redis database to be used.
--postgres_host POSTGRES_HOST: Enter the host address for the PostgreSQL server.
--postgres_port POSTGRES_PORT: Enter the port number for the PostgreSQL server.
--postgres_user POSTGRES_USER: Provide the username for the PostgreSQL server.
--postgres_password POSTGRES_PASSWORD: Provide the password for the PostgreSQL server.
--postgres_database POSTGRES_DATABASE: Specify the PostgreSQL database to be used.
--postgres_table POSTGRES_TABLE: Specify the PostgreSQL table to be used.
--dynamodb_table_name DYNAMODB_TABLE_NAME: Provide the name of the DynamoDB table.
--dynamodb_region_name DYNAMODB_REGION_NAME: Specify the AWS region for DynamoDB.
--prometheus_gateway PROMETHEUS_GATEWAY: Specify the Prometheus gateway URL.
--args ...: Additional keyword arguments to pass to the bolt.
: Choose the type of input data: batch or streaming.
{batch,streaming,stream_to_batch}
: Choose the type of output data: batch or streaming.
{none,redis,postgres,dynamodb,prometheus}
: Select the type of state manager: none, redis, postgres, dynamodb, or prometheus.
{k8s}
: Choose the type of deployment.
method_name
: The name of the method to execute on the bolt.
Options genius TestBoltCtlBolt deploy
--buffer_size BUFFER_SIZE: Specify the size of the buffer.
--input_folder INPUT_FOLDER: Specify the directory where input files should be stored temporarily.
--input_kafka_topic INPUT_KAFKA_TOPIC: Kafka input topic for streaming bolts.
--input_kafka_cluster_connection_string INPUT_KAFKA_CLUSTER_CONNECTION_STRING: Kafka connection string for streaming bolts.
--input_kafka_consumer_group_id INPUT_KAFKA_CONSUMER_GROUP_ID: Kafka consumer group id to use.
--input_s3_bucket INPUT_S3_BUCKET: Provide the name of the S3 bucket for input storage.
--input_s3_folder INPUT_S3_FOLDER: Indicate the S3 folder for input storage.
--output_folder OUTPUT_FOLDER: Specify the directory where output files should be stored temporarily.
--output_kafka_topic OUTPUT_KAFKA_TOPIC: Kafka output topic for streaming bolts.
--output_kafka_cluster_connection_string OUTPUT_KAFKA_CLUSTER_CONNECTION_STRING: Kafka connection string for streaming bolts.
--output_s3_bucket OUTPUT_S3_BUCKET: Provide the name of the S3 bucket for output storage.
--output_s3_folder OUTPUT_S3_FOLDER: Indicate the S3 folder for output storage.
--redis_host REDIS_HOST: Enter the host address for the Redis server.
--redis_port REDIS_PORT: Enter the port number for the Redis server.
--redis_db REDIS_DB: Specify the Redis database to be used.
--postgres_host POSTGRES_HOST: Enter the host address for the PostgreSQL server.
--postgres_port POSTGRES_PORT: Enter the port number for the PostgreSQL server.
--postgres_user POSTGRES_USER: Provide the username for the PostgreSQL server.
--postgres_password POSTGRES_PASSWORD: Provide the password for the PostgreSQL server.
--postgres_database POSTGRES_DATABASE: Specify the PostgreSQL database to be used.
--postgres_table POSTGRES_TABLE: Specify the PostgreSQL table to be used.
--dynamodb_table_name DYNAMODB_TABLE_NAME: Provide the name of the DynamoDB table.
--dynamodb_region_name DYNAMODB_REGION_NAME: Specify the AWS region for DynamoDB.
--prometheus_gateway PROMETHEUS_GATEWAY: Specify the Prometheus gateway URL.
--k8s_kind {deployment,service,job,cron_job}: Choose the type of Kubernetes resource.
--k8s_name K8S_NAME: Name of the Kubernetes resource.
--k8s_image K8S_IMAGE: Docker image for the Kubernetes resource.
--k8s_replicas K8S_REPLICAS: Number of replicas.
--k8s_env_vars K8S_ENV_VARS: Environment variables as a JSON string.
--k8s_cpu K8S_CPU: CPU requirements.
--k8s_memory K8S_MEMORY: Memory requirements.
--k8s_storage K8S_STORAGE: Storage requirements.
--k8s_gpu K8S_GPU: GPU requirements.
--k8s_kube_config_path K8S_KUBE_CONFIG_PATH: Path to the local kubeconfig file for the cluster.
--k8s_api_key K8S_API_KEY: API key for the Kubernetes cluster.
--k8s_api_host K8S_API_HOST: API host for the Kubernetes cluster.
--k8s_verify_ssl K8S_VERIFY_SSL: Whether to verify SSL certificates.
--k8s_ssl_ca_cert K8S_SSL_CA_CERT: Path to the SSL CA certificate.
--k8s_cluster_name K8S_CLUSTER_NAME: Name of the Kubernetes cluster.
--k8s_context_name K8S_CONTEXT_NAME: Name of the kubeconfig context.
--k8s_namespace K8S_NAMESPACE: Kubernetes namespace.
--k8s_labels K8S_LABELS: Labels for Kubernetes resources, as a JSON string.
--k8s_annotations K8S_ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--k8s_port K8S_PORT: Port to run the bolt on as a service.
--k8s_target_port K8S_TARGET_PORT: Port to expose the bolt on as a service.
--k8s_schedule K8S_SCHEDULE: Schedule to run the bolt on as a cron job.
--args ...: Additional keyword arguments to pass to the bolt.
usage: genius pod status [-h] [--kube_config_path KUBE_CONFIG_PATH]
       [--cluster_name CLUSTER_NAME] [--context_name CONTEXT_NAME]
       [--namespace NAMESPACE] [--labels LABELS] [--annotations ANNOTATIONS]
       [--api_key API_KEY] [--api_host API_HOST] [--verify_ssl VERIFY_SSL]
       [--ssl_ca_cert SSL_CA_CERT]
       name
name
: Name of the Kubernetes pod.
Options genius pod status
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--follow FOLLOW: Whether to follow the logs.
--tail TAIL: Number of lines to show from the end of the logs.
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--replicas REPLICAS: Number of replicas.
--env_vars ENV_VARS: Environment variables as a JSON string.
--cpu CPU: CPU requirements.
--memory MEMORY: Memory requirements.
--storage STORAGE: Storage requirements.
--gpu GPU: GPU requirements.
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--replicas REPLICAS: Number of replicas.
--port PORT: Service port.
--target_port TARGET_PORT: Container target port.
--env_vars ENV_VARS: Environment variables as a JSON string.
--cpu CPU: CPU requirements.
--memory MEMORY: Memory requirements.
--storage STORAGE: Storage requirements.
--gpu GPU: GPU requirements.
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--env_vars ENV_VARS: Environment variables as a JSON string.
--cpu CPU: CPU requirements.
--memory MEMORY: Memory requirements.
--storage STORAGE: Storage requirements.
--gpu GPU: GPU requirements.
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--env_vars ENV_VARS: Environment variables as a JSON string.
--cpu CPU: CPU requirements.
--memory MEMORY: Memory requirements.
--storage STORAGE: Storage requirements.
--gpu GPU: GPU requirements.
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--kube_config_path KUBE_CONFIG_PATH: Path to the kubeconfig file.
--cluster_name CLUSTER_NAME: Name of the Kubernetes cluster.
--context_name CONTEXT_NAME: Name of the kubeconfig context.
--namespace NAMESPACE: Kubernetes namespace.
--labels LABELS: Labels for Kubernetes resources, as a JSON string.
--annotations ANNOTATIONS: Annotations for Kubernetes resources, as a JSON string.
--api_key API_KEY: API key for the Kubernetes cluster.
--api_host API_HOST: API host for the Kubernetes cluster.
--verify_ssl VERIFY_SSL: Whether to verify SSL certificates.
--ssl_ca_cert SSL_CA_CERT: Path to the SSL CA certificate.
--auth AUTH: Authentication credentials as a JSON string.
--base_image BASE_IMAGE: The base image to use for the Docker container.
--workdir WORKDIR: The working directory in the Docker container.
--local_dir LOCAL_DIR: The local directory to copy into the Docker container.
--packages [PACKAGES ...]: List of Python packages to install in the Docker container.
--os_packages [OS_PACKAGES ...]: List of OS packages to install in the Docker container.
--env_vars ENV_VARS: Environment variables to set in the Docker container.
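Several of the flags above (`--labels`, `--annotations`, `--env_vars`, `--auth`) take their values as JSON strings. A minimal sketch of preparing such values in Python before invoking the CLI — the specific label keys and variable names here are illustrative, not prescribed by Geniusrise:

```python
import json
import shlex

# Values the CLI flags expect as JSON strings (keys are illustrative).
labels = {"app": "my-spout", "team": "data"}
env_vars = {"LOG_LEVEL": "info", "WORKERS": "4"}

# json.dumps yields the JSON string; shlex.quote keeps it shell-safe,
# since the JSON itself contains double quotes.
labels_arg = shlex.quote(json.dumps(labels))
env_arg = shlex.quote(json.dumps(env_vars))

flags = f"--labels {labels_arg} --env_vars {env_arg}"
print(flags)
```

Quoting through `shlex.quote` matters here: an unquoted JSON string would be split by the shell at whitespace and mangled by its embedded quotes.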
You can manage Kubernetes deployments using the genius CLI. Here are some example commands:
```bash
# Show pods in a namespace
genius pod show --namespace geniusrise --context_name arn:aws:eks:us-east-1:genius-dev:cluster/geniusrise

# Scale a deployment
genius pod scale --namespace geniusrise --context_name arn:aws:eks:us-east-1:genius-dev:cluster/geniusrise --name testspout --replicas 3

# Delete a deployment
genius pod delete --namespace geniusrise --context_name arn:aws:eks:us-east-1:genius-dev:cluster/geniusrise --name testspout
```
You can manage ECS deployments using the genius CLI. Here are some example commands:
```bash
# Show tasks in a cluster
genius ecs show --cluster geniusrise-cluster --account_id 123456789012

# Scale a service
genius ecs scale --cluster geniusrise-cluster --account_id 123456789012 --name postgresbolt --desired_count 3

# Delete a service
genius ecs delete --cluster geniusrise-cluster --account_id 123456789012 --name postgresbolt
```
TextClassificationAPI leveraging Hugging Face's transformers for text classification tasks.
This API provides an interface to classify text into various categories like sentiment, topic, intent, etc.
Attributes:

| Name | Type | Description |
|------|------|-------------|
| model | AutoModelForSequenceClassification | A Hugging Face model for sequence classification. |
| tokenizer | AutoTokenizer | A tokenizer for preprocessing text. |
| hf_pipeline | Pipeline | A Hugging Face pipeline for text classification. |
Methods
classify(self): Classifies text using the model and tokenizer.
classification_pipeline(self): Classifies text using the Hugging Face pipeline.
initialize_pipeline(self): Lazy initialization of the classification pipeline.
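The lazy initialization mentioned for `initialize_pipeline` means the Hugging Face pipeline is only constructed on the first request that needs it, keeping server startup fast. The general pattern can be sketched as follows — a generic sketch, not the actual Geniusrise implementation (the stand-in factory avoids a real model download so the example is self-contained):

```python
from typing import Any, Callable, Optional

class LazyPipeline:
    """Defer building an expensive pipeline until the first call needs it."""

    def __init__(self, factory: Callable[[], Any]) -> None:
        self._factory = factory          # e.g. lambda: transformers.pipeline("text-classification")
        self._pipeline: Optional[Any] = None

    def __call__(self, text: str) -> Any:
        if self._pipeline is None:       # built once, on first use
            self._pipeline = self._factory()
        return self._pipeline(text)

# Demo with a stand-in "pipeline" so the sketch runs without model downloads.
builds = []
def build_pipeline():
    builds.append("built")
    return lambda text: {"label": "POSITIVE", "input": text}

classify = LazyPipeline(build_pipeline)
classify("first call")    # triggers construction
classify("second call")   # reuses the cached pipeline
```

Because construction happens inside the first call, a server can bind its port immediately and pay the model-loading cost only when traffic arrives.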
Accepts text input and returns classification results using the Hugging Face pipeline.
This method uses the Hugging Face pipeline for efficient and robust text classification. It's suitable for various
classification tasks such as sentiment analysis, topic classification, and intent recognition.
Returns:

| Type | Description |
|------|-------------|
| Dict[str, Any] | A dictionary containing the original input text and the classification results. |
Example CURL Request for text classification:
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/classification_pipeline \
  -H "Content-Type: application/json" \
  -d '{"text": "The movie was fantastic, with great acting and plot."}' | jq
```
Accepts text input and returns classification results. The method uses the model and tokenizer to classify the text
and provide the likelihood of each class label.
Returns:

| Type | Description |
|------|-------------|
| Dict[str, Any] | A dictionary containing the original input text and the classification scores for each label. |
Example CURL Request for text classification:
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/classify \
  -H "Content-Type: application/json" \
  -d '{ "text": "tata sons lost a major contract to its rival mahindra motors" }' | jq
```
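The `classify` endpoint reports a likelihood for each class label. Under the hood, a sequence-classification model emits one raw logit per label, and the usual way to turn those into the probabilities returned here is a softmax. A minimal, self-contained sketch (the logit values and labels are made up for illustration):

```python
import math
from typing import Dict, List

def softmax(logits: List[float]) -> List[float]:
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)                              # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def label_scores(logits: List[float], labels: List[str]) -> Dict[str, float]:
    return dict(zip(labels, softmax(logits)))

scores = label_scores([2.0, 0.5, -1.0], ["positive", "neutral", "negative"])
```

The highest logit always maps to the highest probability, so ranking labels by score is equivalent to ranking them by logit; the softmax only adds a calibrated scale.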
Represents a Natural Language Inference (NLI) API leveraging Hugging Face's transformer models. This class is capable of
handling various NLI tasks such as entailment, classification, similarity checking, and more. Utilizes CherryPy for exposing
API endpoints that can be interacted with via standard HTTP requests.
Attributes:

| Name | Type | Description |
|------|------|-------------|
| model | AutoModelForSequenceClassification | The loaded Hugging Face model for sequence classification tasks. |
| tokenizer | AutoTokenizer | The tokenizer corresponding to the model, used for processing input text. |
CLI Usage Example:
For interacting with the NLI API, you would typically start the server using a command similar to one listed in the provided examples.
After the server is running, you can use CURL commands to interact with the different endpoints.
Endpoint for classifying the input text into one of the provided candidate labels using zero-shot classification.
Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| **kwargs | Any | Arbitrary keyword arguments, typically containing 'text' and 'candidate_labels'. | {} |

Returns:

| Type | Description |
|------|-------------|
| Dict[str, Any] | A dictionary containing the input text, candidate labels, and classification scores. |
Example CURL Request:
```bash
curl -X POST localhost:3000/api/v1/classify \
  -H "Content-Type: application/json" \
  -d '{ "text": "The new movie is a thrilling adventure in space", "candidate_labels": ["entertainment", "politics", "business"] }'
```
Detects the intent of the input text from a list of possible intents.
Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| text | str | The input text. | required |
| intents | List[str] | A list of possible intents. | required |

Returns:

| Type | Description |
|------|-------------|
| Dict[str, Any] | A dictionary containing the input text and detected intent with its score. |
Example CURL Request:
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/detect_intent \
  -H "Content-Type: application/json" \
  -d '{ "text": "Theres something magical about Recurrent Neural Networks (RNNs). I still remember when I trained my first recurrent network for Image Captioning. Within a few dozen minutes of training my first baby model (with rather arbitrarily-chosen hyperparameters) started to generate very nice looking descriptions of images that were on the edge of making sense. Sometimes the ratio of how simple your model is to the quality of the results you get out of it blows past your expectations, and this was one of those times. What made this result so shocking at the time was that the common wisdom was that RNNs were supposed to be difficult to train (with more experience Ive in fact reached the opposite conclusion). Fast forward about a year: Im training RNNs all the time and Ive witnessed their power and robustness many times, and yet their magical outputs still find ways of amusing me.", "intents": ["teach","sell","note","advertise","promote"] }' | jq
```
Endpoint for evaluating the entailment relationship between a premise and a hypothesis. It returns the relationship
scores across possible labels like entailment, contradiction, and neutral.
Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| **kwargs | Any | Arbitrary keyword arguments, typically containing 'premise' and 'hypothesis'. | {} |

Returns:

| Type | Description |
|------|-------------|
| Dict[str, Any] | A dictionary containing the premise, hypothesis, and their relationship scores. |
Example CURL Request:
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/entailment \
  -H "Content-Type: application/json" \
  -d '{ "premise": "This a very good entry level smartphone, battery last 2-3 days after fully charged when connected to the internet. No memory lag issue when playing simple hidden object games. Performance is beyond my expectation, i bought it with a good bargain, couldnt ask for more!", "hypothesis": "the phone has an awesome battery life" }' | jq
```
Performs fact checking on a statement given a context.
Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| context | str | The context or background information. | required |
| statement | str | The statement to fact check. | required |

Returns:

| Type | Description |
|------|-------------|
| Dict[str, Any] | A dictionary containing fact checking scores. |
Example CURL Request:
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/fact_checking \
  -H "Content-Type: application/json" \
  -d '{ "context": "Theres something magical about Recurrent Neural Networks (RNNs). I still remember when I trained my first recurrent network for Image Captioning. Within a few dozen minutes of training my first baby model (with rather arbitrarily-chosen hyperparameters) started to generate very nice looking descriptions of images that were on the edge of making sense. Sometimes the ratio of how simple your model is to the quality of the results you get out of it blows past your expectations, and this was one of those times. What made this result so shocking at the time was that the common wisdom was that RNNs were supposed to be difficult to train (with more experience Ive in fact reached the opposite conclusion). Fast forward about a year: Im training RNNs all the time and Ive witnessed their power and robustness many times, and yet their magical outputs still find ways of amusing me.", "statement": "The author is looking for a home loan" }' | jq
```
Performs question answering for multiple choice questions.
Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| question | str | The question text. | required |
| choices | List[str] | A list of possible answers. | required |

Returns:

| Type | Description |
|------|-------------|
| Dict[str, Any] | A dictionary containing the scores for each answer choice. |
Example CURL Request:
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/question_answering \
  -H "Content-Type: application/json" \
  -d '{ "question": "[ML-1T-2] is the dimensional formula of", "choices": ["force", "coefficient of friction", "modulus of elasticity", "energy"] }' | jq
```
Evaluates the textual similarity between two texts.
Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| text1 | str | The first text. | required |
| text2 | str | The second text. | required |

Returns:

| Type | Description |
|------|-------------|
| Dict[str, Any] | A dictionary containing the similarity score. |
Example CURL Request:
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/textual_similarity \
  -H "Content-Type: application/json" \
  -d '{ "text1": "Theres something magical about Recurrent Neural Networks (RNNs). I still remember when I trained my first recurrent network for Image Captioning. Within a few dozen minutes of training my first baby model (with rather arbitrarily-chosen hyperparameters) started to generate very nice looking descriptions of images that were on the edge of making sense. Sometimes the ratio of how simple your model is to the quality of the results you get out of it blows past your expectations, and this was one of those times. What made this result so shocking at the time was that the common wisdom was that RNNs were supposed to be difficult to train (with more experience Ive in fact reached the opposite conclusion). Fast forward about a year: Im training RNNs all the time and Ive witnessed their power and robustness many times, and yet their magical outputs still find ways of amusing me.", "text2": "There is something magical about training neural networks. Their simplicity coupled with their power is astonishing." }' | jq
```
Performs zero-shot classification using the Hugging Face pipeline.
It allows classification of text without explicitly provided labels.
Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| **kwargs | Any | Arbitrary keyword arguments, typically containing 'premise' and 'hypothesis'. | {} |

Returns:

| Type | Description |
|------|-------------|
| Dict[str, Any] | A dictionary containing the premise, hypothesis, and their classification scores. |
Example CURL Request for zero-shot classification:
```bash
curl -X POST localhost:3000/api/v1/zero_shot_classification \
  -H "Content-Type: application/json" \
  -d '{ "premise": "A new study shows that the Mediterranean diet is good for heart health.", "hypothesis": "The study is related to diet and health." }' | jq
```
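Zero-shot classification with an NLI model works by recasting each candidate label as an entailment hypothesis and scoring the input text as the premise against it. The hypothesis template below mirrors the common `transformers` default ("This example is {}.") — an assumption for illustration, not necessarily the exact template Geniusrise uses:

```python
from typing import List, Tuple

def build_nli_pairs(text: str, candidate_labels: List[str],
                    template: str = "This example is {}.") -> List[Tuple[str, str]]:
    """Turn a zero-shot request into (premise, hypothesis) pairs for an NLI model.

    The NLI model then scores each pair; the entailment probability for a pair
    becomes the score of its candidate label.
    """
    return [(text, template.format(label)) for label in candidate_labels]

pairs = build_nli_pairs(
    "The new movie is a thrilling adventure in space",
    ["entertainment", "politics", "business"],
)
```

This is why an entailment endpoint and a zero-shot endpoint can share the same underlying model: zero-shot is just batched entailment over templated hypotheses.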
InstructionAPI is designed for generating text based on prompts using instruction-tuned language models.
It serves as an interface to Hugging Face's pre-trained instruction-tuned models, providing a flexible API
for various text generation tasks. It can be used in scenarios ranging from generating creative content to
providing instructions or answers based on the prompts.
Attributes:

| Name | Type | Description |
|------|------|-------------|
| model | Any | The loaded instruction-tuned language model. |
| tokenizer | Any | The tokenizer for processing text suitable for the model. |
Methods
complete(**kwargs: Any) -> Dict[str, Any]:
Generates text based on the given prompt and decoding strategy.
listen(**model_args: Any) -> None:
Starts a server to listen for text generation requests.
Handles chat interaction using the Hugging Face pipeline. This method enables conversational text generation,
simulating a chat-like interaction based on user and system prompts.
Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| **kwargs | Any | Arbitrary keyword arguments containing 'user_prompt' and 'system_prompt'. | {} |

Returns:

| Type | Description |
|------|-------------|
| Dict[str, Any] | A dictionary containing the user prompt, system prompt, and chat interaction results. |
Example CURL Request for chat interaction:
```bash
/usr/bin/curl -X POST localhost:3001/api/v1/chat \
  -H "Content-Type: application/json" \
  -d '{ "user_prompt": "What is the capital of France?", "system_prompt": "The capital of France is" }' | jq
```
Handles POST requests to generate chat completions using the llama.cpp engine. This method accepts various
parameters for customizing the chat completion request, including messages, sampling settings, and more.
Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| messages | List[Dict[str, str]] | The chat messages for generating a response. | required |
| functions | Optional[List[Dict]] | A list of functions to use for the chat completion (advanced usage). | required |
| function_call | Optional[Dict] | A function call to use for the chat completion (advanced usage). | required |
| tools | Optional[List[Dict]] | A list of tools to use for the chat completion (advanced usage). | required |
| tool_choice | Optional[Dict] | A tool choice option for the chat completion (advanced usage). | required |
| temperature | float | The temperature to use for sampling, controlling randomness. | required |
| top_p | float | The nucleus sampling's top-p parameter, controlling diversity. | required |
| top_k | int | The top-k sampling parameter, limiting the token selection pool. | required |
| min_p | float | The minimum probability threshold for sampling. | required |
| typical_p | float | The typical-p parameter for locally typical sampling. | required |
| stream | bool | Flag to stream the results. | required |
| stop | Optional[Union[str, List[str]]] | Tokens or sequences where generation should stop. | required |
| seed | Optional[int] | Seed for random number generation to ensure reproducibility. | required |
| response_format | Optional[Dict] | Specifies the format of the generated response. | required |
| max_tokens | Optional[int] | Maximum number of tokens to generate. | required |
| presence_penalty | float | Penalty for token presence to discourage repetition. | required |
| frequency_penalty | float | Penalty for token frequency to discourage common tokens. | required |
| repeat_penalty | float | Penalty applied to tokens that are repeated. | required |
| tfs_z | float | Tail-free sampling parameter to adjust the likelihood of tail tokens. | required |
| mirostat_mode | int | Mirostat sampling mode for dynamic adjustments. | required |
| mirostat_tau | float | Tau parameter for mirostat sampling, controlling deviation. | required |
| mirostat_eta | float | Eta parameter for mirostat sampling, controlling adjustment speed. | required |
| model | Optional[str] | Specifies the model to use for generation. | required |
| logits_processor | Optional[List] | List of logits processors for advanced generation control. | required |
| grammar | Optional[Dict] | Specifies grammar rules for the generated text. | required |
| logit_bias | Optional[Dict[str, float]] | Adjustments to the logits of specified tokens. | required |
| logprobs | Optional[bool] | Whether to include log probabilities in the output. | required |
| top_logprobs | Optional[int] | Number of top log probabilities to include. | required |

Returns:

| Type | Description |
|------|-------------|
| Dict[str, Any] | A dictionary containing the chat completion response or an error message. |
Example CURL Request:
```bash
curl -X POST "http://localhost:3000/api/v1/chat_llama_cpp" \
  -H "Content-Type: application/json" \
  -d '{ "messages": [ {"role": "user", "content": "What is the capital of France?"}, {"role": "system", "content": "The capital of France is"} ], "temperature": 0.2, "top_p": 0.95, "top_k": 40, "max_tokens": 50 }'
```
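The `mirostat_tau` and `mirostat_eta` parameters above control a feedback loop that keeps the "surprise" (negative log-probability) of generated tokens near the target `tau`. A toy sketch of the core update rule, based on the published Mirostat algorithm — simplified for illustration, not the llama.cpp implementation:

```python
def mirostat_update(mu: float, observed_surprise: float,
                    tau: float, eta: float) -> float:
    """Move the surprise ceiling mu toward the target tau.

    mu caps which tokens are eligible for sampling; eta is the learning rate
    of the feedback loop, tau the desired average surprise.
    """
    error = observed_surprise - tau
    return mu - eta * error

mu = 10.0            # a typical initial value is 2 * tau
tau, eta = 5.0, 0.1
# A run of surprising tokens (surprise above tau) drags mu downward,
# making the sampler more conservative on subsequent steps.
for surprise in [8.0, 7.0, 6.0, 5.5]:
    mu = mirostat_update(mu, surprise, tau, eta)
```

A larger `eta` makes `mu` react faster to each token; a larger `tau` permits more surprising (less predictable) text on average.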
Handles POST requests to generate chat completions using the vLLM engine. This method accepts various
parameters for customizing the chat completion request, including message content, generation settings, and more.
Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| messages | List[Dict[str, str]] | The chat messages for generating a response. Each message should include a 'role' (either 'user' or 'system') and 'content'. | required |
| temperature | float | The sampling temperature. Defaults to 0.7. Higher values generate more random completions. | required |
| top_p | float | The nucleus sampling probability. Defaults to 1.0. A smaller value leads to higher diversity. | required |
| n | int | The number of completions to generate. Defaults to 1. | required |
| max_tokens | int | The maximum number of tokens to generate. Controls the length of the generated response. | required |
| stop | Union[str, List[str]] | Sequence(s) where the generation should stop. Can be a single string or a list of strings. | required |
| stream | bool | Whether to stream the response. Streaming may be useful for long completions. | required |
| presence_penalty | float | Adjusts the likelihood of tokens based on their presence in the conversation so far. Defaults to 0.0. | required |
| frequency_penalty | float | Adjusts the likelihood of tokens based on their frequency in the conversation so far. Defaults to 0.0. | required |
| logit_bias | Dict[str, float] | Adjustments to the logits of specified tokens, identified by token IDs as keys and adjustment values as values. | required |
| user | str | An identifier for the user making the request. Can be used for logging or customization. | required |
| best_of | int | Generates 'n' completions server-side and returns the best one. Higher values incur more computation cost. | required |
| top_k | int | Filters the generated tokens to the top-k tokens with the highest probabilities. Defaults to -1, which disables top-k filtering. | required |
| ignore_eos | bool | Whether to ignore the end-of-sentence token in generation. Useful for more fluid continuations. | required |
| use_beam_search | bool | Whether to use beam search instead of sampling for generation. Beam search can produce more coherent results. | required |
| stop_token_ids | List[int] | List of token IDs that should cause generation to stop. | required |
| skip_special_tokens | bool | Whether to skip special tokens (like padding or end-of-sequence tokens) in the output. | required |
| spaces_between_special_tokens | bool | Whether to insert spaces between special tokens in the output. | required |
| add_generation_prompt | bool | Whether to prepend the generation prompt to the output. | required |
| echo | bool | Whether to include the input prompt in the output. | required |
| repetition_penalty | float | Penalty applied to tokens that have been generated previously. Defaults to 1.0, which applies no penalty. | required |
| min_p | float | Sets a minimum threshold for token probabilities. Tokens with probabilities below this threshold are filtered out. | required |
| include_stop_str_in_output | bool | Whether to include the stop string(s) in the output. | required |
| length_penalty | float | Exponential penalty to the length for beam search. Only relevant if use_beam_search is True. | required |

Returns:

| Type | Description |
|------|-------------|
| Dict[str, Any] | A dictionary with the chat completion response or an error message. |
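The `presence_penalty` and `frequency_penalty` parameters above discourage repetition in complementary ways: presence applies a flat penalty to any token that has appeared at all, while frequency scales with how often it has appeared. A minimal sketch of the OpenAI-style formula these parameters follow (the token strings and logit values are illustrative):

```python
from collections import Counter
from typing import Dict, List

def apply_penalties(logits: Dict[str, float], generated: List[str],
                    presence_penalty: float, frequency_penalty: float) -> Dict[str, float]:
    """Penalize already-generated tokens before the next sampling step.

    adjusted = logit - presence_penalty * (count > 0) - frequency_penalty * count
    """
    counts = Counter(generated)
    return {
        tok: logit
             - presence_penalty * (1 if counts[tok] > 0 else 0)
             - frequency_penalty * counts[tok]
        for tok, logit in logits.items()
    }

adjusted = apply_penalties(
    {"the": 2.0, "cat": 1.0, "dog": 1.5},
    ["the", "the", "cat"],          # "the" appeared twice, "cat" once, "dog" never
    presence_penalty=0.5,
    frequency_penalty=0.1,
)
```

With both penalties at 0.0 (the defaults) the logits pass through unchanged; raising `frequency_penalty` specifically punishes tokens that repeat many times.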
Handles POST requests to generate text based on the given prompt and decoding strategy. It uses the pre-trained
model specified in the setup to generate a completion for the input prompt.
Args:
**kwargs (Any): Arbitrary keyword arguments containing the 'prompt' and other parameters for text generation.
Returns:
Dict[str, Any]: A dictionary containing the original prompt and the generated completion.
Example CURL Requests:
```bash
/usr/bin/curl -X POST localhost:3001/api/v1/complete -H "Content-Type: application/json" -d '{
"prompt": "<|system|>
<|end|>
<|user|>
How do I sort a list in Python?<|end|>
<|assistant|>",
"decoding_strategy": "generate",
"max_new_tokens": 100,
"do_sample": true,
"temperature": 0.7,
"top_k": 50,
"top_p": 0.95
}' | jq
```
LanguageModelAPI is a class for interacting with pre-trained language models to generate text. It allows for
customizable text generation via a CherryPy web server, handling requests and generating responses using
a specified language model. This class is part of the GeniusRise ecosystem for facilitating NLP tasks.
Attributes:

| Name | Type | Description |
|------|------|-------------|
| model | Any | The loaded language model used for text generation. |
| tokenizer | Any | The tokenizer corresponding to the language model, used for processing input text. |
Methods
complete(**kwargs: Any) -> Dict[str, Any]: Generates text based on provided prompts and model parameters.
Handles POST requests to generate text based on a given prompt and model-specific parameters. This method
is exposed as a web endpoint through CherryPy and returns a JSON response containing the original prompt,
the generated text, and any additional returned information from the model.
Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| **kwargs | Any | Arbitrary keyword arguments containing the prompt, and any additional parameters. | {} |

Returns:

| Type | Description |
|------|-------------|
| Dict[str, Any] | A dictionary with the original prompt, generated text, and other model-specific information. |
Example CURL Request:
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/complete \
  -H "Content-Type: application/json" \
  -d '{ "prompt": "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\nWrite a PRD for Oauth auth using keycloak\n\n### Response:", "decoding_strategy": "generate", "max_new_tokens": 1024, "do_sample": true }' | jq
```
Handles POST requests to generate text completions using the llama.cpp engine. This method accepts various
parameters for customizing the completion request, including the prompt, sampling settings, and more.
Parameters:

| Name | Description | Default |
|------|-------------|---------|
| prompt | The prompt to generate text from. | required |
| suffix | A suffix to append to the generated text. If None, no suffix is appended. | required |
| max_tokens | The maximum number of tokens to generate. If max_tokens <= 0 or None, the maximum number of tokens to generate is unlimited and depends on n_ctx. | required |
| temperature | The temperature to use for sampling. | required |
| top_p | The top-p value to use for nucleus sampling. Nucleus sampling is described in the paper "The Curious Case of Neural Text Degeneration" (https://arxiv.org/abs/1904.09751). | required |
| min_p | The min-p value to use for minimum-p sampling, as described in https://github.com/ggerganov/llama.cpp/pull/3841. | required |
| typical_p | The typical-p value to use for sampling. Locally typical sampling is described in https://arxiv.org/abs/2202.00666. | required |
| logprobs | The number of logprobs to return. If None, no logprobs are returned. | required |
| echo | Whether to echo the prompt. | required |
| stop | A list of strings to stop generation when encountered. | required |
| frequency_penalty | The penalty to apply to tokens based on their frequency in the prompt. | required |
| presence_penalty | The penalty to apply to tokens based on their presence in the prompt. | required |
| repeat_penalty | The penalty to apply to repeated tokens. | required |
| top_k | The top-k value to use for sampling. Top-k sampling is described in the paper "The Curious Case of Neural Text Degeneration" (https://arxiv.org/abs/1904.09751). | required |
| stream | Whether to stream the results. | required |
| seed | The seed to use for sampling. | required |
| tfs_z | The tail-free sampling parameter. Tail-free sampling is described in https://www.trentonbricken.com/Tail-Free-Sampling/. | required |
| mirostat_mode | The mirostat sampling mode. | required |
| mirostat_tau | The target cross-entropy (or surprise) value for the generated text. A higher value corresponds to more surprising or less predictable text; a lower value to less surprising or more predictable text. | required |
| mirostat_eta | The learning rate used to update mu based on the error between the target and observed surprisal of the sampled word. A larger learning rate updates mu more quickly; a smaller one results in slower updates. | required |
| model | The name to use for the model in the completion object. | required |
| stopping_criteria | A list of stopping criteria to use. | required |
| logits_processor | A list of logits processors to use. | required |
| grammar | A grammar to use for constrained sampling. | required |
| logit_bias | A logit bias to use. | required |

Returns:

| Type | Description |
|------|-------------|
| Dict[str, Any] | A dictionary containing the completion response or an error message. |
Example CURL Request:
```bash
curl -X POST "http://localhost:3001/api/v1/complete_llama_cpp" \
  -H "Content-Type: application/json" \
  -d '{ "prompt": "Whats the weather like in London?", "temperature": 0.7, "top_p": 0.95, "top_k": 40, "max_tokens": 50, "repeat_penalty": 1.1 }'
```
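The `top_p` parameter used throughout these endpoints implements nucleus sampling: keep the smallest set of tokens whose cumulative probability reaches `top_p`, discard the rest, and renormalize before sampling. A minimal sketch of that filtering step (the token probabilities are made up for illustration):

```python
from typing import Dict

def top_p_filter(probs: Dict[str, float], top_p: float) -> Dict[str, float]:
    """Nucleus filtering: retain the highest-probability tokens whose
    cumulative mass reaches top_p, then renormalize to sum to 1."""
    kept, cumulative = {}, 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: kv[1], reverse=True):
        kept[tok] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    return {tok: p / total for tok, p in kept.items()}

nucleus = top_p_filter({"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}, top_p=0.9)
```

Unlike `top_k`, which always keeps a fixed number of candidates, the nucleus grows and shrinks with the shape of the distribution: a confident model may keep one token, an uncertain one many.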
Handles POST requests to generate completions using the vLLM engine.
This method accepts various parameters for customizing the completion request, including message content,
generation settings, and more.
**kwargs (Any): Arbitrary keyword arguments. Expects data in JSON format containing any of the following keys:

- messages (Union[str, List[Dict[str, str]]]): The messages for the chat context.
- temperature (float, optional): The sampling temperature. Defaults to 0.7.
- top_p (float, optional): The nucleus sampling probability. Defaults to 1.0.
- n (int, optional): The number of completions to generate. Defaults to 1.
- max_tokens (int, optional): The maximum number of tokens to generate.
- stop (Union[str, List[str]], optional): Stop sequence to end generation.
- stream (bool, optional): Whether to stream the response. Defaults to False.
- presence_penalty (float, optional): The presence penalty. Defaults to 0.0.
- frequency_penalty (float, optional): The frequency penalty. Defaults to 0.0.
- logit_bias (Dict[str, float], optional): Adjustments to the logits of specified tokens.
- user (str, optional): An identifier for the user making the request.
- (Additional model-specific parameters)

Returns:

Dict[str, Any]: A dictionary with the chat completion response or an error message.
Example CURL Request:
```bash
curl -v -X POST "http://localhost:3000/api/v1/complete_vllm" \
  -H "Content-Type: application/json" \
  -u "user:password" \
  -d '{ "messages": ["Whats the weather like in London?"], "temperature": 0.7, "top_p": 1.0, "n": 1, "max_tokens": 50, "stream": false, "presence_penalty": 0.0, "frequency_penalty": 0.0, "logit_bias": {}, "user": "example_user" }'
```
This request asks the VLLM engine to generate a completion for the provided chat context, with specified generation settings.
NamedEntityRecognitionAPI serves a Named Entity Recognition (NER) model using the Hugging Face transformers library.
It is designed to recognize and classify named entities in text into predefined categories such as the names of persons,
organizations, locations, expressions of times, quantities, monetary values, percentages, etc.
Attributes:

| Name | Type | Description |
|------|------|-------------|
| model | Any | The loaded NER model, typically a Hugging Face transformer model specialized for token classification. |
| tokenizer | Any | The tokenizer for preprocessing text compatible with the loaded model. |
Recognizes named entities in the input text using the Hugging Face pipeline.
This method leverages a pre-trained NER model to identify and classify entities in text into categories such as
names, organizations, locations, etc. It's suitable for processing various types of text content.
Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| **kwargs | Any | Arbitrary keyword arguments, typically containing 'text' for the input text. | {} |

Returns:

| Type | Description |
|------|-------------|
| Dict[str, Any] | A dictionary containing the original input text and a list of recognized entities. |
Example CURL Request for NER:
```bash
curl -X POST localhost:3000/api/v1/ner_pipeline \
  -H "Content-Type: application/json" \
  -d '{"text": "John Doe works at OpenAI in San Francisco."}' | jq
```
Endpoint for recognizing named entities in the input text using the loaded NER model.
Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| **kwargs | Any | Arbitrary keyword arguments, typically containing 'text' for the input text. | {} |

Returns:

| Type | Description |
|------|-------------|
| Dict[str, Any] | A dictionary containing the original input text and a list of recognized entities with their respective types. |
Example CURL Requests:
```bash
curl -X POST localhost:3000/api/v1/recognize_entities \
  -H "Content-Type: application/json" \
  -d '{"text": "John Doe works at OpenAI in San Francisco."}' | jq

curl -X POST localhost:3000/api/v1/recognize_entities \
  -H "Content-Type: application/json" \
  -d '{"text": "Alice is going to visit the Eiffel Tower in Paris next summer."}' | jq
```
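Token-classification models emit one tag per token, typically in a BIO scheme (B-PER begins a person entity, I-PER continues it, O is outside any entity), and a NER API must merge those tags into whole entities before returning them. A generic sketch of that aggregation step — the token/tag shape here is illustrative, not the exact Geniusrise output format:

```python
from typing import Dict, List

def group_entities(tokens: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Merge BIO-tagged tokens into whole entities with their types."""
    entities, current = [], None
    for tok in tokens:
        tag = tok["tag"]
        if tag.startswith("B-"):                 # a new entity begins
            if current:
                entities.append(current)
            current = {"entity": tag[2:], "text": tok["word"]}
        elif tag.startswith("I-") and current and current["entity"] == tag[2:]:
            current["text"] += " " + tok["word"]  # continue the current entity
        else:                                     # O tag (or a stray I-) ends it
            if current:
                entities.append(current)
            current = None
    if current:
        entities.append(current)
    return entities

ents = group_entities([
    {"word": "John", "tag": "B-PER"},
    {"word": "Doe", "tag": "I-PER"},
    {"word": "works", "tag": "O"},
    {"word": "OpenAI", "tag": "B-ORG"},
])
```

Hugging Face pipelines expose a similar step through their `aggregation_strategy` option; doing it by hand makes the B-/I- mechanics explicit.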
A class for handling different types of QA models, including traditional QA, TAPAS (Table-based QA), and TAPEX.
It utilizes the Hugging Face transformers library to provide state-of-the-art question answering capabilities
across various formats of data including plain text and tabular data.
Answers questions based on the provided context (text or table). It adapts to the model type (traditional, TAPAS, TAPEX)
and provides answers accordingly.
Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| **kwargs | Any | Arbitrary keyword arguments, typically containing the 'question' and 'data' (context or table). | {} |

Returns:

| Type | Description |
|------|-------------|
| Dict[str, Any] | A dictionary containing the question, context/table, and answer(s). |
Example CURL Request for Text-based QA:
```bash
curl -X POST localhost:3000/api/v1/answer \
  -H "Content-Type: application/json" \
  -d '{"question": "What is the capital of France?", "data": "France is a country in Europe. Its capital is Paris."}'
```
Example CURL Requests:
```bash
/usr/bin/curl -X POST localhost:3000/api/v1/answer \
  -H "Content-Type: application/json" \
  -d '{ "data": "Theres something magical about Recurrent Neural Networks (RNNs). I still remember when I trained my first recurrent network for Image Captioning. Within a few dozen minutes of training my first baby model (with rather arbitrarily-chosen hyperparameters) started to generate very nice looking descriptions of images that were on the edge of making sense. Sometimes the ratio of how simple your model is to the quality of the results you get out of it blows past your expectations, and this was one of those times. What made this result so shocking at the time was that the common wisdom was that RNNs were supposed to be difficult to train (with more experience Ive in fact reached the opposite conclusion). Fast forward about a year: Im training RNNs all the time and Ive witnessed their power and robustness many times, and yet their magical outputs still find ways of amusing me.", "question": "What is the common wisdom about RNNs?" }' | jq

/usr/bin/curl -X POST localhost:3000/api/v1/answer \
  -H "Content-Type: application/json" \
  -d '{ "data": [ {"Name": "Alice", "Age": "30"}, {"Name": "Bob", "Age": "25"} ], "question": "what is their total age?" }' | jq

/usr/bin/curl -X POST localhost:3000/api/v1/answer \
  -H "Content-Type: application/json" \
  -d '{ "data": {"Actors": ["Brad Pitt", "Leonardo Di Caprio", "George Clooney"], "Number of movies": ["87", "53", "69"]}, "question": "how many movies does Leonardo Di Caprio have?" }' | jq
```
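The table-based requests above show two equivalent shapes for tabular `data`: a list of row dicts and a dict of column lists. Table-QA models like TAPAS are typically fed the columnar form, so converting between the two is a common preprocessing step. A small sketch of that conversion (assuming every row has the same keys):

```python
from typing import Dict, List

def rows_to_columns(rows: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Convert a list-of-rows table into the dict-of-columns shape,
    as in the Actors example above."""
    columns: Dict[str, List[str]] = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns

table = rows_to_columns([
    {"Name": "Alice", "Age": "30"},
    {"Name": "Bob", "Age": "25"},
])
```

Note that the cell values stay strings even for numbers ("30", "25"); table-QA models tokenize cells as text and handle numeric aggregation internally.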
Answers questions using the Hugging Face pipeline based on the provided context.
Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| **kwargs | Any | Arbitrary keyword arguments, typically containing 'question' and 'data'. | {} |

Returns:

| Type | Description |
|------|-------------|
| Dict[str, Any] | A dictionary containing the question, context, and the answer. |
Example CURL Request for QA:
curl -X POST localhost:3000/api/v1/answer_pipeline -H "Content-Type: application/json" -d '{"question": "Who is the CEO of Tesla?", "data": "Elon Musk is the CEO of Tesla."}'
A class for serving a Hugging Face-based summarization model. This API provides an interface to
submit text and receive a summarized version, utilizing state-of-the-art machine learning models for
text summarization.
Attributes:
Name
Type
Description
model
AutoModelForSeq2SeqLM
The loaded Hugging Face model for summarization.
tokenizer
AutoTokenizer
The tokenizer for preprocessing text.
Methods
summarize(self, **kwargs: Any) -> Dict[str, Any]:
Summarizes the input text based on the given parameters.
Summarizes the input text based on the given parameters using a machine learning model. The method
accepts parameters via a POST request and returns the summarized text.
Parameters:
Name
Type
Description
Default
**kwargs
Any
Arbitrary keyword arguments. Expected to receive these from the POST request's JSON body.
{}
Returns:
Type
Description
Dict[str, Any]
Dict[str, Any]: A dictionary containing the input text and its summary.
Example CURL Requests:
/usr/bin/curl -X POST localhost:3000/api/v1/summarize \
  -H "Content-Type: application/json" \
  -d '{ "text": "Theres something magical about Recurrent Neural Networks (RNNs). I still remember when I trained my first recurrent network for Image Captioning. Within a few dozen minutes of training my first baby model (with rather arbitrarily-chosen hyperparameters) started to generate very nice looking descriptions of images that were on the edge of making sense. Sometimes the ratio of how simple your model is to the quality of the results you get out of it blows past your expectations, and this was one of those times. What made this result so shocking at the time was that the common wisdom was that RNNs were supposed to be difficult to train (with more experience Ive in fact reached the opposite conclusion). Fast forward about a year: Im training RNNs all the time and Ive witnessed their power and robustness many times, and yet their magical outputs still find ways of amusing me.", "decoding_strategy": "generate", "bos_token_id": 0, "decoder_start_token_id": 2, "early_stopping": true, "eos_token_id": 2, "forced_bos_token_id": 0, "forced_eos_token_id": 2, "length_penalty": 2.0, "max_length": 142, "min_length": 56, "no_repeat_ngram_size": 3, "num_beams": 4, "pad_token_id": 1, "do_sample": false }' | jq
/usr/bin/curl -X POST localhost:3000/api/v1/summarize \
  -H "Content-Type: application/json" \
  -d '{ "text": "Theres something magical about Recurrent Neural Networks (RNNs). I still remember when I trained my first recurrent network for Image Captioning. Within a few dozen minutes of training my first baby model (with rather arbitrarily-chosen hyperparameters) started to generate very nice looking descriptions of images that were on the edge of making sense. Sometimes the ratio of how simple your model is to the quality of the results you get out of it blows past your expectations, and this was one of those times. What made this result so shocking at the time was that the common wisdom was that RNNs were supposed to be difficult to train (with more experience Ive in fact reached the opposite conclusion). Fast forward about a year: Im training RNNs all the time and Ive witnessed their power and robustness many times, and yet their magical outputs still find ways of amusing me.", "decoding_strategy": "generate", "early_stopping": true, "length_penalty": 2.0, "max_length": 142, "min_length": 56, "no_repeat_ngram_size": 3, "num_beams": 4 }' | jq
Summarizes the input text using the Hugging Face pipeline based on given parameters.
Parameters:
Name
Type
Description
Default
**kwargs
Any
Keyword arguments containing parameters for summarization.
{}
Returns:
Type
Description
Dict[str, Any]
A dictionary containing the input text and its summary.
Example CURL Request for summarization:
curl -X POST localhost:3000/api/v1/summarize_pipeline -H "Content-Type: application/json" -d '{"text": "Your long text here"}'
A class for serving a Hugging Face-based translation model as a web API.
This API allows users to submit text for translation and receive translated text
in the specified target language using advanced machine learning models.
Parameters:
Name
Type
Description
Default
input
BatchInput
Configurations and data inputs for the batch process.
required
output
BatchOutput
Configurations for output data handling.
required
state
State
State management for the translation task.
required
**kwargs
Any
Additional keyword arguments for extended configurations.
Translates text to a specified target language using the underlying Hugging Face model.
This endpoint accepts JSON data with the text and language details,
processes it through the machine learning model, and returns the translated text.
Parameters:
Name
Type
Description
Default
**kwargs
Any
Arbitrary keyword arguments, usually empty as parameters are in the POST body.
{}
POST body parameters
text (str): The text to be translated.
decoding_strategy (str): Strategy to use for decoding text; e.g., 'beam_search', 'greedy'. Default is 'generate'.
source_lang (str): Source language code.
target_lang (str): Target language code. Default is 'en'.
additional_params (dict): Other model-specific parameters for translation.
Returns:
Type
Description
Dict[str, Any]
Dict[str, Any]: A dictionary with the original text, target language, and translated text.
/usr/bin/curl -X POST localhost:3000/api/v1/translate \
  -H "Content-Type: application/json" \
  -d '{ "text": "संयुक्त राष्ट्र के प्रमुख का कहना है कि सीरिया में कोई सैन्य समाधान नहीं है", "source_lang": "hi_IN", "target_lang": "en_XX", "decoding_strategy": "generate", "decoder_start_token_id": 2, "early_stopping": true, "eos_token_id": 2, "forced_eos_token_id": 2, "max_length": 200, "num_beams": 5, "pad_token_id": 1 }' | jq
Endpoint for translating text using a pre-initialized Hugging Face translation pipeline.
This method is designed to handle translation requests more efficiently by utilizing
a preloaded model and tokenizer, reducing the overhead of loading these components for each request.
Returns:
Type
Description
Dict[str, Any]
Dict[str, Any]: A dictionary containing the original text, source language, target language, and the translated text.
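Putting the documented POST body parameters together, the translation request body can be assembled as below (a standard-library sketch; field names follow the parameter list above, and the helper name is illustrative):

```python
import json

def build_translation_body(text, source_lang, target_lang="en",
                           decoding_strategy="generate", **additional_params):
    """Assemble the JSON body documented for the translation endpoint.
    Extra keyword arguments become model-specific parameters (e.g. num_beams)."""
    body = {
        "text": text,
        "source_lang": source_lang,
        "target_lang": target_lang,
        "decoding_strategy": decoding_strategy,
    }
    body.update(additional_params)
    return json.dumps(body)
```

The resulting string is what the curl example above passes via `-d`.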
The VisionAPI class inherits from VisionBulk and is designed to facilitate
the handling of vision-based tasks using a pre-trained machine learning model.
It sets up a server to process image-related requests using a specified model.
ImageClassificationAPI extends the VisionAPI for image classification tasks. This API provides functionalities
to classify images into various categories based on the trained model it uses. It supports both single-label
and multi-label classification problems.
Methods
classify_image(self): Endpoint to classify an uploaded image and return the classification scores.
sigmoid(self, _outputs): Applies the sigmoid function to the model's outputs.
softmax(self, _outputs): Applies the softmax function to the model's outputs.
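The sigmoid and softmax helpers listed above correspond to the standard post-processing for multi-label and single-label classification respectively. A minimal pure-Python sketch of what they compute, assuming the model outputs a flat list of logits (the class's actual implementation operates on tensors):

```python
import math

def sigmoid(outputs):
    """Element-wise sigmoid: independent per-label probabilities (multi-label)."""
    return [1.0 / (1.0 + math.exp(-x)) for x in outputs]

def softmax(outputs):
    """Softmax over logits: probabilities that sum to 1 (single-label)."""
    m = max(outputs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in outputs]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]
probs = softmax(logits)  # the highest logit gets the highest probability
```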
Initializes the ImageClassificationAPI with the necessary configurations for input, output, and state management,
along with model-specific parameters.
Parameters:
Name
Type
Description
Default
input
BatchInput
Configuration for the input data.
required
output
BatchOutput
Configuration for the output data.
required
state
State
State management for the API.
required
**kwargs
Additional keyword arguments for extended functionality, such as model configuration.
Endpoint for classifying an image. It accepts a base64-encoded image, decodes it, preprocesses it, and
runs it through the classification model. It supports both single-label and multi-label classification
by applying the appropriate post-processing function to the model outputs.
Returns:
Type
Description
Dict[str, Any]
Dict[str, Any]: A dictionary containing the predictions with the highest scores and all prediction scores. Each prediction includes the label and its corresponding score.
Raises:
Type
Description
Exception
If an error occurs during image processing or classification.
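Since the endpoint accepts the image as a base64 string, the request body can be built from raw image bytes as below (standard library only; the `image_base64` field name mirrors the visual question answering example later in this document and may differ for classification — check your deployment):

```python
import base64
import json

def build_image_payload(image_bytes: bytes, field: str = "image_base64") -> str:
    """Encode raw image bytes as base64 and wrap them in a JSON request body."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return json.dumps({field: encoded})

# Round-trip check: the server-side decode recovers the original bytes.
payload = json.loads(build_image_payload(b"\x89PNG fake bytes"))
original = base64.b64decode(payload["image_base64"])
```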
VisionSegmentationAPI extends VisionAPI to provide image segmentation functionalities, including panoptic,
instance, and semantic segmentation. This API supports different segmentation tasks based on the model's
capabilities and the specified subtask in the request.
Methods
segment_image(self): Processes an image for segmentation and returns the segmentation masks along with labels.
Initializes the VisionSegmentationAPI with configurations for input, output, and state management, along
with any model-specific parameters for segmentation tasks.
Parameters:
Name
Type
Description
Default
input
BatchInput
Configuration for the input data.
required
output
BatchOutput
Configuration for the output data.
required
state
State
State management for the API.
required
**kwargs
Additional keyword arguments for extended functionality.
Endpoint for segmenting an image according to the specified subtask (panoptic, instance, or semantic segmentation).
It decodes the base64-encoded image, processes it through the model, and returns the segmentation masks along with
labels and scores (if applicable) in base64 format.
The method supports dynamic task inputs for models requiring specific task descriptions and applies different
post-processing techniques based on the subtask.
Returns:
Type
Description
List[Dict[str, Any]]
List[Dict[str, Any]]: A list of dictionaries where each dictionary contains a 'label', a 'score' (if applicable), and a 'mask' (base64-encoded image of the segmentation mask).
Raises:
Type
Description
Exception
If an error occurs during image processing or segmentation.
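Each returned mask is itself a base64-encoded image. A minimal sketch of decoding the response entries back to raw mask bytes on the client side (standard library only; the helper name is illustrative):

```python
import base64

def decode_masks(segments):
    """Decode the base64 'mask' field of each segmentation result entry,
    keeping 'label' and 'score' untouched."""
    return [{**seg, "mask": base64.b64decode(seg["mask"])} for seg in segments]
```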
ImageOCRAPI provides Optical Character Recognition (OCR) capabilities for images, leveraging different OCR engines
like EasyOCR, PaddleOCR, and Hugging Face models tailored for OCR tasks. This API can decode base64-encoded images,
process them through the chosen OCR engine, and return the recognized text.
The API supports dynamic selection of OCR engines and configurations based on the provided model name and arguments,
offering flexibility in processing various languages and image types.
Methods
ocr(self): Processes an uploaded image for OCR and returns the recognized text.
Endpoint for performing OCR on an uploaded image. It accepts a base64-encoded image, decodes it, preprocesses
it through the specified OCR model, and returns the recognized text.
Returns:
Type
Description
Dict[str, Any]: A dictionary containing the success status, recognized text ('result'), and the original
image name ('image_name') if provided.
Raises:
Type
Description
Exception
If an error occurs during image processing or OCR.
Processes the image using a Hugging Face model specified for OCR tasks. Supports advanced configurations
and post-processing to handle various OCR-related challenges.
Parameters:
Name
Type
Description
Default
image
Image.Image
The image to process.
required
use_easyocr_bbox
bool
Whether to use EasyOCR to detect text bounding boxes before processing with
Hugging Face models.
VisualQAAPI extends VisionAPI to provide an interface for visual question answering (VQA) tasks. This API supports
answering questions about an image by utilizing deep learning models specifically trained for VQA. It processes
requests containing an image and a question about the image, performs inference using the loaded model, and returns
the predicted answer.
Methods
answer_question(self): Receives an image and a question, returns the answer based on visual content.
Initializes the VisualQAAPI with configurations for input, output, state management, and any model-specific
parameters for visual question answering tasks.
Parameters:
Name
Type
Description
Default
input
BatchInput
Configuration for the input data.
required
output
BatchOutput
Configuration for the output data.
required
state
State
State management for the API.
required
**kwargs
Additional keyword arguments for extended functionality.
Endpoint for receiving an image with a question and returning the answer based on the visual content of the image.
It processes the request containing a base64-encoded image and a question string, and utilizes the loaded model
to predict the answer to the question related to the image.
Returns:
Type
Description
Dict[str, Any]: A dictionary containing the original question and the predicted answer.
Raises:
Type
Description
ValueError
If required fields 'image_base64' and 'question' are not provided in the request.
Exception
If an error occurs during image processing or inference.
Example CURL Request:
curl -X POST localhost:3000/api/v1/answer_question -H "Content-Type: application/json" -d '{"image_base64": "<base64-encoded-image>", "question": "What is the color of the sky in the image?"}'
or
(base64 -w 0 test_images_segment_finetune/image1.jpg | awk '{print "{\"image_base64\": \""$0"\", \"question\": \"how many cats are there?\"}"}' > /tmp/image_payload.json)
curl -X POST http://localhost:3000/api/v1/answer_question -H "Content-Type: application/json" -u user:password -d @/tmp/image_payload.json | jq
Starts a CherryPy server to listen for requests to generate text.
Parameters:
Name
Type
Description
Default
model_name
str
The name of the pre-trained language model.
required
model_class
str
The name of the class of the pre-trained language model. Defaults to "AutoModelForCausalLM".
'AutoModel'
processor_class
str
The name of the class of the processor used to preprocess input text. Defaults to "AutoProcessor".
'AutoProcessor'
use_cuda
bool
Whether to use a GPU for inference. Defaults to False.
False
precision
str
The precision to use for the pre-trained language model. Defaults to "float16".
'float16'
quantization
int
The level of quantization to use for the pre-trained language model. Defaults to 0.
0
device_map
str | Dict | None
The mapping of devices to use for inference. Defaults to "auto".
'auto'
max_memory
Dict[int, str]
The maximum memory to use for inference. Defaults to {0: "24GB"}.
{0: '24GB'}
torchscript
bool
Whether to use a TorchScript-optimized version of the pre-trained language model. Defaults to False.
False
compile
bool
Enable Torch JIT compilation.
False
concurrent_queries
bool
Whether the API supports concurrent API calls. Defaults to False.
False
use_whisper_cpp
bool
Whether to use whisper.cpp to load the model. Defaults to False. Note: only works for these models: https://github.com/aarnphm/whispercpp/blob/524dd6f34e9d18137085fb92a42f1c31c9c6bc29/src/whispercpp/utils.py#L32
False
use_faster_whisper
bool
Whether to use faster-whisper.
False
endpoint
str
The endpoint to listen on. Defaults to "*".
'*'
port
int
The port to listen on. Defaults to 3000.
3000
cors_domain
str
The domain to allow CORS requests from. Defaults to "http://localhost:3000".
'http://localhost:3000'
username
Optional[str]
The username to use for authentication. Defaults to None.
None
password
Optional[str]
The password to use for authentication. Defaults to None.
None
**model_args
Any
Additional arguments to pass to the pre-trained language model.
API endpoint to convert text input to speech using the text-to-speech model.
Expects a JSON input with 'text' as a key containing the text to be synthesized.
Returns:
Type
Description
Dict[str, str]: A dictionary containing the base64 encoded audio data.
Example CURL Request for synthesis (illustrative; the endpoint path shown here is an assumption, adjust to your deployment):
curl -X POST localhost:3000/api/v1/synthesize -H "Content-Type: application/json" -d '{"text": "Hello, world."}'
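The response carries the synthesized audio as a base64 string. A minimal sketch of decoding it back into raw, writable bytes (standard library only; the response key name is an assumption — check your deployment):

```python
import base64

def decode_audio_response(response_json: dict, key: str = "audio_file") -> bytes:
    """Decode the base64 audio field of a synthesis response into raw bytes.
    The 'audio_file' key name is assumed, mirroring the transcription endpoint."""
    return base64.b64decode(response_json[key])

# Example with a fabricated response; the result can be written with open(..., "wb").
fake = {"audio_file": base64.b64encode(b"RIFF....WAVE").decode("utf-8")}
audio = decode_audio_response(fake)
```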
SpeechToTextAPI is a subclass of AudioAPI specifically designed for speech-to-text models.
It extends the functionality to handle speech-to-text processing using various ASR models.
Attributes:
Name
Type
Description
model
AutoModelForCTC
The speech-to-text model.
processor
AutoProcessor
The processor to prepare input audio data for the model.
Methods
transcribe(audio_input: bytes) -> str:
Transcribes the given audio input to text using the speech-to-text model.
Recognizes named entities in the input text using the Hugging Face pipeline.
This method leverages a pre-trained NER model to identify and classify entities in text into categories such as
names, organizations, locations, etc. It's suitable for processing various types of text content.
Parameters:
Name
Type
Description
Default
**kwargs
Any
Arbitrary keyword arguments, typically containing 'text' for the input text.
{}
Returns:
Type
Description
Dict[str, Any]
Dict[str, Any]: A dictionary containing the original input text and a list of recognized entities.
API endpoint to transcribe the given audio input to text using the speech-to-text model.
Expects a JSON input with 'audio_file' as a key containing the base64 encoded audio data.
Returns:
Type
Description
Dict[str, str]: A dictionary containing the transcribed text.
TextBulk is a foundational class for enabling bulk processing of text with various generation models.
It primarily focuses on using Hugging Face models to provide a robust and efficient framework for
large-scale text generation tasks. The class supports various decoding strategies to generate text
that can be tailored to specific needs or preferences.
Attributes:
Name
Type
Description
model
AutoModelForCausalLM
The language model for text generation.
tokenizer
AutoTokenizer
The tokenizer for preparing input data for the model.
Parameters:
Name
Type
Description
Default
input
BatchInput
Configuration and data inputs for the batch process.
required
output
BatchOutput
Configurations for output data handling.
required
state
State
State management for the Bolt.
required
**kwargs
Arbitrary keyword arguments for extended configurations.
{}
Methods
text(**kwargs: Any) -> Dict[str, Any]:
Provides an API endpoint for text generation functionality.
Accepts various parameters for customizing the text generation process.
generate(prompt: str, decoding_strategy: str = "generate", **generation_params: Any) -> dict:
Generates text based on the provided prompt and parameters. Supports multiple decoding strategies for diverse applications.
The class serves as a versatile tool for text generation, supporting various models and configurations.
It can be extended or used as is for efficient text generation tasks.
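The `generate` method above accepts a `decoding_strategy` plus free-form generation parameters. As an illustration of how such a dispatch is commonly structured — a sketch under assumed strategy names and parameter sets, not the class's actual implementation:

```python
def select_generation_params(decoding_strategy: str, **params):
    """Keep only the parameters relevant to the chosen decoding strategy;
    unknown strategies (and 'generate') pass everything through."""
    allowed = {
        "greedy_search": {"max_length"},
        "beam_search": {"num_beams", "max_length", "early_stopping", "length_penalty"},
        "sample": {"do_sample", "temperature", "top_k", "top_p", "max_length"},
    }
    keys = allowed.get(decoding_strategy)
    if keys is None:
        return dict(params)
    return {k: v for k, v in params.items() if k in keys}
```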
TextClassificationBulk is designed to handle bulk text classification tasks using Hugging Face models efficiently and
effectively. It allows for processing large datasets, utilizing state-of-the-art machine learning models to provide
accurate classification of text data into predefined labels.
Parameters:
Name
Type
Description
Default
input
BatchInput
Configuration and data inputs for the batch process.
required
output
BatchOutput
Configurations for output data handling.
required
state
State
State management for the classification task.
required
**kwargs
Arbitrary keyword arguments for extended configurations.
Perform bulk classification using the specified model and tokenizer. This method handles the entire classification
process including loading the model, processing input data, predicting classifications, and saving the results.
Parameters:
Name
Type
Description
Default
model_name
str
Name or path of the model.
required
model_class
str
Class name of the model (default "AutoModelForSequenceClassification").
'AutoModelForSequenceClassification'
tokenizer_class
str
Class name of the tokenizer (default "AutoTokenizer").
'AutoTokenizer'
use_cuda
bool
Whether to use CUDA for model inference (default False).
False
precision
str
Precision for model computation (default "float").
'float'
quantization
int
Level of quantization for optimizing model size and speed (default 0).
0
device_map
str | Dict | None
Specific device to use for computation (default "auto").
'auto'
max_memory
Dict
Maximum memory configuration for devices.
{0: '24GB'}
torchscript
bool
Whether to use a TorchScript-optimized version of the pre-trained language model. Defaults to False.
False
compile
bool
Whether to compile the model before fine-tuning. Defaults to False.
False
awq_enabled
bool
Whether to enable AWQ optimization (default False).
False
flash_attention
bool
Whether to use flash attention optimization (default False).
False
batch_size
int
Number of classifications to process simultaneously (default 32).
32
**kwargs
Any
Arbitrary keyword arguments for model and generation configurations.
The NLIBulk class provides functionality for large-scale natural language inference (NLI) processing using Hugging Face
transformers. It allows users to load datasets, configure models, and perform inference on batches of premise-hypothesis pairs.
Attributes:
Name
Type
Description
input
BatchInput
Configuration and data inputs for the batch process.
Performs NLI inference on a loaded dataset using the specified model. The method processes the data in batches and saves
the results to the configured output path.
Parameters:
Name
Type
Description
Default
model_name
str
Name or path of the NLI model.
required
max_length
int
Maximum length of the sequences for tokenization purposes. Defaults to 512.
512
model_class
str
Class name of the model (e.g., "AutoModelForSequenceClassification"). Defaults to "AutoModelForSeq2SeqLM".
'AutoModelForSeq2SeqLM'
tokenizer_class
str
Class name of the tokenizer (e.g., "AutoTokenizer"). Defaults to "AutoTokenizer".
'AutoTokenizer'
use_cuda
bool
Whether to use CUDA for model inference. Defaults to False.
False
precision
str
Precision for model computation (e.g., "float16"). Defaults to "float16".
'float16'
quantization
int
Level of quantization for optimizing model size and speed. Defaults to 0.
0
device_map
str | Dict | None
Specific device to use for computation. Defaults to "auto".
'auto'
max_memory
Dict
Maximum memory configuration for devices. Defaults to {0: "24GB"}.
{0: '24GB'}
torchscript
bool
Whether to use a TorchScript-optimized version of the pre-trained language model. Defaults to False.
False
compile
bool
Whether to compile the model before fine-tuning. Defaults to False.
False
awq_enabled
bool
Whether to enable AWQ optimization. Defaults to False.
False
flash_attention
bool
Whether to use flash attention optimization. Defaults to False.
False
batch_size
int
Number of premise-hypothesis pairs to process simultaneously. Defaults to 32.
32
**kwargs
Any
Arbitrary keyword arguments for model and generation configurations.
InstructionBulk is a class designed to perform bulk text generation tasks using Hugging Face's instruction-tuned language models.
It is optimized for large-scale text generation, providing an efficient interface to use state-of-the-art machine learning
models for generating text based on a set of instructions or prompts.
Attributes:
Name
Type
Description
model
Any
The loaded, pre-trained instruction-tuned language model.
tokenizer
Any
The tokenizer for processing text compatible with the model.
Methods
load_dataset(dataset_path: str, max_length: int = 1024, **kwargs) -> Optional[Dataset]:
Loads a dataset for text generation tasks from the specified directory.
perform(model_name: str, **kwargs: Any) -> None:
Performs bulk text generation using the specified model and tokenizer.
Loads a dataset from the specified path. This method supports various data formats including JSON, CSV, Parquet,
and others. It's designed to facilitate the bulk processing of text data for generation tasks.
Parameters:
Name
Type
Description
Default
dataset_path
str
Path to the directory containing the dataset files.
required
max_length
int
Maximum token length for text processing (default is 1024).
1024
**kwargs
Additional keyword arguments for dataset loading.
{}
Returns:
Type
Description
Optional[Dataset]
Optional[Dataset]: A Dataset object if loading is successful; otherwise, None.
Performs text generation in bulk using a specified instruction-tuned model. This method handles the entire
process, including model loading, prompt processing, text generation, and saving the results.
Parameters:
Name
Type
Description
Default
model_name
str
The name or path of the instruction-tuned model.
required
model_class
str
The class of the language model. Defaults to "AutoModelForCausalLM".
'AutoModelForCausalLM'
tokenizer_class
str
The class of the tokenizer. Defaults to "AutoTokenizer".
'AutoTokenizer'
use_cuda
bool
Whether to use CUDA for model inference. Defaults to False.
False
precision
str
Precision for model computation. Defaults to "float16".
'float16'
quantization
int
Level of quantization for optimizing model size and speed. Defaults to 0.
0
device_map
str | Dict | None
Specific device to use for computation. Defaults to "auto".
'auto'
max_memory
Dict
Maximum memory configuration for devices. Defaults to {0: "24GB"}.
{0: '24GB'}
torchscript
bool
Whether to use a TorchScript-optimized version of the pre-trained language model. Defaults to False.
False
compile
bool
Whether to compile the model before fine-tuning. Defaults to False.
False
awq_enabled
bool
Whether to enable AWQ optimization. Defaults to False.
False
flash_attention
bool
Whether to use flash attention optimization. Defaults to False.
False
decoding_strategy
str
Strategy for decoding the completion. Defaults to "generate".
'generate'
**kwargs
Any
Configuration and additional arguments for text generation such as model class, tokenizer class,
precision, device map, and other generation-related parameters.
{}
Note
Additional arguments are passed directly to the model and tokenizer initialization and the generation method.
Performs bulk text generation using the LLaMA model with llama.cpp backend. This method handles the entire
process, including model loading, prompt processing, text generation, and saving the results.
Parameters:
Name
Type
Description
Default
model
str
Path or identifier for the LLaMA model.
required
filename
Optional[str]
Optional filename or glob pattern to match the model file.
Performs bulk text generation using the vLLM inference engine with specified parameters
for fine-tuning model behavior, including quantization and parallel processing settings. This method is designed
to process large datasets efficiently by leveraging VLLM capabilities for generating high-quality text completions
based on provided prompts.
Parameters:
Name
Type
Description
Default
model_name
str
The name or path of the VLLM model to use for text generation.
required
use_cuda
bool
Flag indicating whether to use CUDA for GPU acceleration.
False
precision
str
Precision of computations, can be "float16", "bfloat16", etc.
'float16'
quantization
int
Level of quantization for model weights, 0 for none.
0
device_map
str | Dict | None
Specific device(s) to use for model inference.
'auto'
vllm_tokenizer_mode
str
Mode of the tokenizer ("auto", "fast", or "slow").
'auto'
vllm_download_dir
Optional[str]
Directory to download and load the model and tokenizer.
None
vllm_load_format
str
Format to load the model, e.g., "auto", "pt".
'auto'
vllm_seed
int
Seed for random number generation.
42
vllm_max_model_len
int
Maximum sequence length the model can handle.
1024
vllm_enforce_eager
bool
Enforce eager execution instead of using optimization techniques.
False
vllm_max_context_len_to_capture
int
Maximum context length for CUDA graph capture.
8192
vllm_block_size
int
Block size for caching mechanism.
16
vllm_gpu_memory_utilization
float
Fraction of GPU memory to use.
0.9
vllm_swap_space
int
Amount of swap space to use in GiB.
4
vllm_sliding_window
Optional[int]
Size of the sliding window for processing.
None
vllm_pipeline_parallel_size
int
Number of pipeline parallel groups.
1
vllm_tensor_parallel_size
int
Number of tensor parallel groups.
1
vllm_worker_use_ray
bool
Whether to use Ray for model workers.
False
vllm_max_parallel_loading_workers
Optional[int]
Maximum number of workers for parallel loading.
None
vllm_disable_custom_all_reduce
bool
Disable custom all-reduce kernel and fall back to NCCL.
False
vllm_max_num_batched_tokens
Optional[int]
Maximum number of tokens to be processed in a single iteration.
None
vllm_max_num_seqs
int
Maximum number of sequences to be processed in a single iteration.
64
vllm_max_paddings
int
Maximum number of paddings to be added to a batch.
512
vllm_max_lora_rank
Optional[int]
Maximum rank for LoRA adjustments.
None
vllm_max_loras
Optional[int]
Maximum number of LoRA adjustments.
None
vllm_max_cpu_loras
Optional[int]
Maximum number of LoRA adjustments stored on CPU.
None
vllm_lora_extra_vocab_size
int
Additional vocabulary size for LoRA.
0
vllm_placement_group
Optional[dict]
Ray placement group for distributed execution.
None
vllm_log_stats
bool
Whether to log statistics during model operation.
False
notification_email
Optional[str]
Email to send notifications upon completion.
None
batch_size
int
Number of prompts to process in each batch for efficient memory usage.
32
**kwargs
Any
Additional keyword arguments for generation settings like temperature, top_p, etc.
{}
This method automates the loading of large datasets, generation of text completions, and saving results,
facilitating efficient and scalable text generation tasks.
LanguageModelBulk is designed for large-scale text generation using Hugging Face language models in a bulk processing
manner. It's particularly useful for tasks such as bulk content creation, summarization, or any other scenario where
large datasets need to be processed with a language model.
Attributes:
Name
Type
Description
model
Any
The loaded language model used for text generation.
tokenizer
Any
The tokenizer corresponding to the language model, used for processing input text.
Parameters:
Name
Type
Description
Default
input
BatchInput
Configuration for the input data.
required
output
BatchOutput
Configuration for the output data.
required
state
State
State management for the API.
required
**kwargs
Any
Arbitrary keyword arguments for extended functionality.
Performs text completion on the loaded dataset using the specified model and tokenizer. The method handles the
entire process, including model loading, text generation, and saving the results.
Parameters:
Name
Type
Description
Default
model_name
str
The name of the language model to use for text completion.
required
model_class
str
The class of the language model. Defaults to "AutoModelForCausalLM".
'AutoModelForCausalLM'
tokenizer_class
str
The class of the tokenizer. Defaults to "AutoTokenizer".
'AutoTokenizer'
use_cuda
bool
Whether to use CUDA for model inference. Defaults to False.
False
precision
str
Precision for model computation. Defaults to "float16".
'float16'
quantization
int
Level of quantization for optimizing model size and speed. Defaults to 0.
0
device_map
str | Dict | None
Specific device to use for computation. Defaults to "auto".
'auto'
max_memory
Dict
Maximum memory configuration for devices. Defaults to {0: "24GB"}.
{0: '24GB'}
torchscript
bool
Whether to use a TorchScript-optimized version of the pre-trained language model. Defaults to False.
False
compile
bool
Whether to compile the model before fine-tuning. Defaults to False.
False
awq_enabled
bool
Whether to enable AWQ optimization. Defaults to False.
False
flash_attention
bool
Whether to use flash attention optimization. Defaults to False.
False
decoding_strategy
str
Strategy for decoding the completion. Defaults to "generate".
Performs bulk text generation using the LLaMA model with llama.cpp backend. This method handles the entire
process, including model loading, prompt processing, text generation, and saving the results.
Parameters:
Name
Type
Description
Default
model
str
Path or identifier for the LLaMA model.
required
filename
Optional[str]
Optional filename or glob pattern to match the model file.
Performs bulk text generation using the vLLM inference engine with specified parameters
for fine-tuning model behavior, including quantization and parallel processing settings. This method is designed
to process large datasets efficiently by leveraging VLLM capabilities for generating high-quality text completions
based on provided prompts.
Parameters:
Name
Type
Description
Default
model_name
str
The name or path of the VLLM model to use for text generation.
required
use_cuda
bool
Flag indicating whether to use CUDA for GPU acceleration.
False
precision
str
Precision of computations, can be "float16", "bfloat16", etc.
'float16'
quantization
int
Level of quantization for model weights, 0 for none.
0
device_map
str | Dict | None
Specific device(s) to use for model inference.
'auto'
vllm_tokenizer_mode
str
Mode of the tokenizer ("auto", "fast", or "slow").
'auto'
vllm_download_dir
Optional[str]
Directory to download and load the model and tokenizer.
None
vllm_load_format
str
Format to load the model, e.g., "auto", "pt".
'auto'
vllm_seed
int
Seed for random number generation.
42
vllm_max_model_len
int
Maximum sequence length the model can handle.
1024
vllm_enforce_eager
bool
Enforce eager execution instead of using optimization techniques.
False
vllm_max_context_len_to_capture
int
Maximum context length for CUDA graph capture.
8192
vllm_block_size
int
Block size for caching mechanism.
16
vllm_gpu_memory_utilization
float
Fraction of GPU memory to use.
0.9
vllm_swap_space
int
Amount of swap space to use in GiB.
4
vllm_sliding_window
Optional[int]
Size of the sliding window for processing.
None
vllm_pipeline_parallel_size
int
Number of pipeline parallel groups.
1
vllm_tensor_parallel_size
int
Number of tensor parallel groups.
1
vllm_worker_use_ray
bool
Whether to use Ray for model workers.
False
vllm_max_parallel_loading_workers
Optional[int]
Maximum number of workers for parallel loading.
None
vllm_disable_custom_all_reduce
bool
Disable custom all-reduce kernel and fall back to NCCL.
False
vllm_max_num_batched_tokens
Optional[int]
Maximum number of tokens to be processed in a single iteration.
None
vllm_max_num_seqs
int
Maximum number of sequences to be processed in a single iteration.
64
vllm_max_paddings
int
Maximum number of paddings to be added to a batch.
512
vllm_max_lora_rank
Optional[int]
Maximum rank for LoRA adjustments.
None
vllm_max_loras
Optional[int]
Maximum number of LoRA adjustments.
None
vllm_max_cpu_loras
Optional[int]
Maximum number of LoRA adjustments stored on CPU.
None
vllm_lora_extra_vocab_size
int
Additional vocabulary size for LoRA.
0
vllm_placement_group
Optional[dict]
Ray placement group for distributed execution.
None
vllm_log_stats
bool
Whether to log statistics during model operation.
False
notification_email
Optional[str]
Email to send notifications upon completion.
None
batch_size
int
Number of prompts to process in each batch for efficient memory usage.
32
**kwargs
Any
Additional keyword arguments for generation settings like temperature, top_p, etc.
{}
This method automates the loading of large datasets, generation of text completions, and saving results,
facilitating efficient and scalable text generation tasks.
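The batch_size parameter drives a simple chunking loop over the prompt list, one generation call per chunk. Conceptually (the function name is illustrative, not the library's internals):

```python
from typing import Iterator, List

def batched(prompts: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield successive fixed-size batches of prompts, one per generation call."""
    for start in range(0, len(prompts), batch_size):
        yield prompts[start:start + batch_size]
```

Keeping batches bounded this way is what lets the method trade throughput against GPU memory usage.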
NamedEntityRecognitionBulk is a class designed for bulk processing of Named Entity Recognition (NER) tasks.
It leverages state-of-the-art NER models from Hugging Face's transformers library to identify and classify entities
such as person names, locations, organizations, and other types of entities from a large corpus of text.
This class provides functionalities to load large datasets, configure NER models, and perform entity recognition
in bulk, making it suitable for processing large volumes of text data efficiently.
Attributes:
Name
Type
Description
model
Any
The NER model loaded for entity recognition tasks.
tokenizer
Any
The tokenizer used for text pre-processing in alignment with the model.
Initializes the NamedEntityRecognitionBulk class with specified input, output, and state configurations.
Sets up the NER model and tokenizer for bulk entity recognition tasks.
Parameters:
Name
Type
Description
Default
input
BatchInput
The input data configuration.
required
output
BatchOutput
The output data configuration.
required
state
State
The state management for the API.
required
**kwargs
Any
Additional keyword arguments for extended functionality.
Loads a dataset from the specified directory path. The method supports various data formats and structures,
ensuring that the dataset is properly formatted for NER tasks.
Parameters:
Name
Type
Description
Default
dataset_path
str
The path to the dataset directory.
required
**kwargs
Any
Additional keyword arguments to handle specific dataset loading scenarios.
{}
Returns:
Type
Description
Optional[Dataset]
Optional[Dataset]: The loaded dataset or None if an error occurs during loading.
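The Optional[Dataset] return type reflects a load-or-None error policy. A simplified stand-in that reads newline-delimited JSON and returns None on any failure; the Hugging Face Dataset type is replaced by a plain list of dicts here, and the function name is an assumption:

```python
import json
import logging
from pathlib import Path
from typing import List, Optional

def load_jsonl_dataset(dataset_path: str) -> Optional[List[dict]]:
    """Load every *.jsonl file under dataset_path; return None on any error."""
    try:
        records: List[dict] = []
        for file in sorted(Path(dataset_path).glob("*.jsonl")):
            with open(file, "r", encoding="utf-8") as f:
                for line in f:
                    if line.strip():
                        records.append(json.loads(line))
        return records
    except Exception as error:
        logging.error("Failed to load dataset: %s", error)
        return None
```

Callers then check for None and skip or abort the bulk run rather than handling per-file exceptions.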
QABulk is a class designed for managing bulk question-answering tasks using Hugging Face models. It is
capable of handling both traditional text-based QA and table-based QA (using TAPAS and TAPEX models),
providing a versatile solution for automated question answering at scale.
Parameters:
Name
Type
Description
Default
input
BatchInput
Configuration and data inputs for batch processing.
required
output
BatchOutput
Configurations for output data handling.
required
state
State
State management for the bulk QA task.
required
**kwargs
Arbitrary keyword arguments for extended functionality.
{}
Example CLI Usage:
# For traditional text-based QA:
genius QABulk rise batch --input_s3_bucket geniusrise-test --input_s3_folder input/qa-traditional batch --output_s3_bucket geniusrise-test --output_s3_folder output/qa-traditional postgres --postgres_host 127.0.0.1 --postgres_port 5432 --postgres_user postgres --postgres_password postgres --postgres_database geniusrise --postgres_table state --id distilbert-base-uncased-distilled-squad-lol answer_questions --args model_name="distilbert-base-uncased-distilled-squad" model_class="AutoModelForQuestionAnswering" tokenizer_class="AutoTokenizer" use_cuda=True precision="bfloat16" quantization=0 device_map="cuda:0" max_memory=None torchscript=False
# For table-based QA using TAPAS:
genius QABulk rise batch --input_s3_bucket geniusrise-test --input_s3_folder input/qa-table batch --output_s3_bucket geniusrise-test --output_s3_folder output/qa-table postgres --postgres_host 127.0.0.1 --postgres_port 5432 --postgres_user postgres --postgres_password postgres --postgres_database geniusrise --postgres_table state --id google/tapas-base-finetuned-wtq-lol answer_questions --args model_name="google/tapas-base-finetuned-wtq" model_class="AutoModelForTableQuestionAnswering" tokenizer_class="AutoTokenizer" use_cuda=True precision="float" quantization=0 device_map="cuda:0" max_memory=None torchscript=False
# For table-based QA using TAPEX:
genius QABulk rise batch --input_s3_bucket geniusrise-test --input_s3_folder input/qa-table batch --output_s3_bucket geniusrise-test --output_s3_folder output/qa-table postgres --postgres_host 127.0.0.1 --postgres_port 5432 --postgres_user postgres --postgres_password postgres --postgres_database geniusrise --postgres_table state --id microsoft/tapex-large-finetuned-wtq-lol answer_questions --args model_name="microsoft/tapex-large-finetuned-wtq" model_class="AutoModelForSeq2SeqLM" tokenizer_class="AutoTokenizer" use_cuda=True precision="float" quantization=0 device_map="cuda:0" max_memory=None torchscript=False
Perform bulk question-answering using the specified model and tokenizer. This method can handle various types
of QA models including traditional, TAPAS, and TAPEX.
Parameters:
Name
Type
Description
Default
model_name
str
Name or path of the question-answering model.
required
model_class
str
Class name of the model (e.g., "AutoModelForQuestionAnswering").
'AutoModelForQuestionAnswering'
tokenizer_class
str
Class name of the tokenizer (e.g., "AutoTokenizer").
'AutoTokenizer'
use_cuda
bool
Whether to use CUDA for model inference. Defaults to False.
False
precision
str
Precision for model computation. Defaults to "float16".
'float16'
quantization
int
Level of quantization for optimizing model size and speed. Defaults to 0.
0
device_map
str | Dict | None
Specific device to use for computation. Defaults to "auto".
'auto'
max_memory
Dict
Maximum memory configuration for devices. Defaults to {0: "24GB"}.
{0: '24GB'}
torchscript
bool
Whether to use a TorchScript-optimized version of the pre-trained language model. Defaults to False.
False
compile
bool
Whether to compile the model before inference. Defaults to False.
False
awq_enabled
bool
Whether to enable AWQ optimization. Defaults to False.
False
flash_attention
bool
Whether to use flash attention optimization. Defaults to False.
False
batch_size
int
Number of questions to process simultaneously. Defaults to 32.
32
**kwargs
Any
Arbitrary keyword arguments for model and generation configurations.
{}
Processing
The method processes the data in batches, utilizing the appropriate model based on the model name
and generating answers for the questions provided in the dataset.
SummarizationBulk is a class for managing bulk text summarization tasks using Hugging Face models. It is
designed to handle large-scale summarization tasks efficiently and effectively, utilizing state-of-the-art
machine learning models to provide high-quality summaries.
The class provides methods to load datasets, configure summarization models, and execute bulk summarization tasks.
Perform bulk summarization using the specified model and tokenizer. This method handles the entire summarization
process including loading the model, processing input data, generating summarization, and saving the results.
Parameters:
Name
Type
Description
Default
model_name
str
Name or path of the summarization model.
required
origin
str
Source language ISO code.
required
target
str
Target language ISO code.
required
max_length
int
Maximum length of the tokens (default 512).
512
model_class
str
Class name of the model (default "AutoModelForSeq2SeqLM").
'AutoModelForSeq2SeqLM'
tokenizer_class
str
Class name of the tokenizer (default "AutoTokenizer").
'AutoTokenizer'
use_cuda
bool
Whether to use CUDA for model inference (default False).
False
precision
str
Precision for model computation (default "float16").
'float16'
quantization
int
Level of quantization for optimizing model size and speed (default 0).
0
device_map
str | Dict | None
Specific device to use for computation (default "auto").
'auto'
max_memory
Dict
Maximum memory configuration for devices.
{0: '24GB'}
torchscript
bool
Whether to use a TorchScript-optimized version of the pre-trained language model. Defaults to False.
False
compile
bool
Whether to compile the model before inference. Defaults to False.
False
awq_enabled
bool
Whether to enable AWQ optimization (default False).
False
flash_attention
bool
Whether to use flash attention optimization (default False).
False
batch_size
int
Number of summaries to process simultaneously (default 32).
32
max_length
int
Maximum length of the summary to be generated (default 512).
512
**kwargs
Any
Arbitrary keyword arguments for model and generation configurations.
TranslationBulk is a class for managing bulk translations using Hugging Face models. It is designed to
handle large-scale translation tasks efficiently and effectively, using state-of-the-art machine learning models
to provide high-quality translations for various language pairs.
This class provides methods for loading datasets, configuring translation models, and executing bulk translation tasks.
Parameters:
Name
Type
Description
Default
input
BatchInput
Configuration and data inputs for batch processing.
required
output
BatchOutput
Configuration for output data handling.
required
state
State
State management for translation tasks.
required
**kwargs
Arbitrary keyword arguments for extended functionality.
Perform bulk translation using the specified model and tokenizer. This method handles the entire translation
process including loading the model, processing input data, generating translations, and saving the results.
Parameters:
Name
Type
Description
Default
model_name
str
Name or path of the translation model.
required
origin
str
Source language ISO code.
required
target
str
Target language ISO code.
required
max_length
int
Maximum length of the tokens (default 512).
512
model_class
str
Class name of the model (default "AutoModelForSeq2SeqLM").
'AutoModelForSeq2SeqLM'
tokenizer_class
str
Class name of the tokenizer (default "AutoTokenizer").
'AutoTokenizer'
use_cuda
bool
Whether to use CUDA for model inference (default False).
False
precision
str
Precision for model computation (default "float16").
'float16'
quantization
int
Level of quantization for optimizing model size and speed (default 0).
0
device_map
str | Dict | None
Specific device to use for computation (default "auto").
'auto'
max_memory
Dict
Maximum memory configuration for devices.
{0: '24GB'}
torchscript
bool
Whether to use a TorchScript-optimized version of the pre-trained language model. Defaults to False.
False
compile
bool
Whether to compile the model before inference. Defaults to False.
False
awq_enabled
bool
Whether to enable AWQ optimization (default False).
False
flash_attention
bool
Whether to use flash attention optimization (default False).
False
batch_size
int
Number of translations to process simultaneously (default 32).
32
**kwargs
Any
Arbitrary keyword arguments for model and generation configurations.
AudioBulk is a class designed for bulk processing of audio data using various audio models from Hugging Face.
It focuses on audio generation and transformation tasks, supporting a range of models and configurations.
Attributes:
Name
Type
Description
model
AutoModelForAudioClassification
The audio model for generation or transformation tasks.
processor
AutoFeatureExtractor
The processor for preparing input data for the model.
Parameters:
Name
Type
Description
Default
input
BatchInput
Configuration and data inputs for the batch process.
required
output
BatchOutput
Configurations for output data handling.
required
state
State
State management for the Bolt.
required
**kwargs
Arbitrary keyword arguments for extended configurations.
{}
Methods
audio(**kwargs: Any) -> Dict[str, Any]:
Provides an API endpoint for audio processing functionality.
Accepts various parameters for customizing the audio processing tasks.
process(audio_input: Union[str, bytes], **processing_params: Any) -> dict:
Processes the audio input based on the provided parameters. Supports multiple processing methods.
Finalizes the AudioBulk processing. Sends notification email if configured.
This method should be called after all audio processing tasks are complete.
It handles any final steps such as sending notifications or cleaning up resources.
Loads and configures the specified audio model and processor for audio processing.
Parameters:
Name
Type
Description
Default
model_name
str
Name or path of the audio model to load.
required
processor_name
str
Name or path of the processor to load.
required
model_revision
Optional[str]
Specific model revision to load (e.g., commit hash).
None
processor_revision
Optional[str]
Specific processor revision to load.
None
model_class
str
Class of the model to be loaded.
''
processor_class
str
Class of the processor to be loaded.
'AutoFeatureExtractor'
use_cuda
bool
Flag to use CUDA for GPU acceleration.
False
precision
str
Desired precision for computations ("float32", "float16", etc.).
'float16'
quantization
int
Bit level for model quantization (0 for none, 8 for 8-bit).
0
device_map
Union[str, Dict, None]
Specific device(s) for model operations.
'auto'
max_memory
Dict[int, str]
Maximum memory allocation for the model.
{0: '24GB'}
torchscript
bool
Enable TorchScript for model optimization.
False
compile
bool
Enable Torch JIT compilation.
False
flash_attention
bool
Flag to enable Flash Attention optimization for faster processing.
False
better_transformers
bool
Flag to enable Better Transformers optimization for faster processing.
False
use_whisper_cpp
bool
Whether to use whisper.cpp to load the model. Defaults to False. Note: only works for these models: https://github.com/aarnphm/whispercpp/blob/524dd6f34e9d18137085fb92a42f1c31c9c6bc29/src/whispercpp/utils.py#L32
API endpoint to convert text input to speech using the text-to-speech model.
Expects a JSON input with 'text' as a key containing the text to be synthesized.
Returns:
Type
Description
Dict[str, str]: A dictionary containing the base64 encoded audio data.
Example CURL Request for synthesis:
... [Provide example CURL request] ...
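Pending that example, the request/response shape can be sketched in Python. The endpoint documents a JSON body with a 'text' key and a base64-encoded audio payload in the response; the response key name ('audio_file') used below is an assumption:

```python
import base64
import json

def build_synthesis_request(text: str) -> bytes:
    """JSON body for the synthesis endpoint: {'text': ...}."""
    return json.dumps({"text": text}).encode("utf-8")

def decode_synthesis_response(body: bytes) -> bytes:
    """Extract raw audio bytes from the base64-encoded response payload."""
    payload = json.loads(body)
    return base64.b64decode(payload["audio_file"])
```

The decoded bytes can then be written straight to a .wav (or other container) file.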
SpeechToTextAPI is a subclass of AudioAPI specifically designed for speech-to-text models.
It extends the functionality to handle speech-to-text processing using various ASR models.
Attributes:
Name
Type
Description
model
AutoModelForCTC
The speech-to-text model.
processor
AutoProcessor
The processor to prepare input audio data for the model.
Methods
transcribe(audio_input: bytes) -> str:
Transcribes the given audio input to text using the speech-to-text model.
Recognizes named entities in the input text using the Hugging Face pipeline.
This method leverages a pre-trained NER model to identify and classify entities in text into categories such as
names, organizations, locations, etc. It's suitable for processing various types of text content.
Parameters:
Name
Type
Description
Default
**kwargs
Any
Arbitrary keyword arguments, typically containing 'text' for the input text.
{}
Returns:
Type
Description
Dict[str, Any]
Dict[str, Any]: A dictionary containing the original input text and a list of recognized entities.
API endpoint to transcribe the given audio input to text using the speech-to-text model.
Expects a JSON input with 'audio_file' as a key containing the base64 encoded audio data.
Returns:
Type
Description
Dict[str, str]: A dictionary containing the transcribed text.
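The transcription endpoint expects the audio base64-encoded under the documented 'audio_file' key. A client-side payload builder might look like this (everything beyond that key name is an assumption):

```python
import base64
import json

def build_transcription_request(audio_bytes: bytes) -> bytes:
    """JSON body with the audio content base64-encoded under 'audio_file'."""
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return json.dumps({"audio_file": encoded}).encode("utf-8")
```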
This bolt uses the Hugging Face Transformers library to fine-tune a pre-trained model.
It uses the Trainer class from the Transformers library to handle the training.
A bolt for fine-tuning Hugging Face models for text classification tasks.
This class extends the TextFineTuner and specializes in fine-tuning models for text classification.
It provides additional functionalities for loading and preprocessing text classification datasets in various formats.
A bolt for fine-tuning Hugging Face models on instruction tuning tasks.
This class inherits from TextFineTuner and specializes in fine-tuning models for instruction-based tasks.
It provides additional methods for loading and preparing datasets in various formats, as well as computing custom metrics.
Compute evaluation metrics for the model's predictions.
This method takes the model's predictions and ground truth labels, converts them to text,
and then computes the BLEU score for evaluation.
Parameters:
Name
Type
Description
Default
eval_pred
EvalPrediction
A named tuple containing predictions and label_ids.
- predictions: The logits predicted by the model of shape (batch_size, sequence_length, num_classes).
- label_ids: The ground truth labels of shape (batch_size, sequence_length).
required
Returns:
Type
Description
Optional[Dict[str, float]]
Optional[Dict[str, float]]: A dictionary containing the BLEU score. Returns None if an exception occurs.
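The BLEU computation itself is typically delegated to a library, but its core (modified n-gram precision combined with a brevity penalty) can be sketched directly. This single-reference, sentence-level version is a conceptual stand-in, not the fine-tuner's actual implementation:

```python
import math
from collections import Counter
from typing import List

def ngrams(tokens: List[str], n: int) -> Counter:
    """Count all n-grams of length n in the token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(prediction: List[str], reference: List[str], max_n: int = 4) -> float:
    """Sentence-level BLEU with uniform n-gram weights and brevity penalty."""
    precisions = []
    for n in range(1, max_n + 1):
        pred_counts = ngrams(prediction, n)
        ref_counts = ngrams(reference, n)
        # Clip each predicted n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in pred_counts.items())
        total = max(sum(pred_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    brevity = min(1.0, math.exp(1 - len(reference) / len(prediction)))
    return brevity * geo_mean
```

In the fine-tuner, predictions and label_ids are first decoded back to text with the tokenizer before a score like this is computed.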
Compute evaluation metrics for the model's predictions.
This method takes the model's predictions and ground truth labels, converts them to text,
and then computes the BLEU score for evaluation.
Parameters:
Name
Type
Description
Default
eval_pred
EvalPrediction
A named tuple containing predictions and label_ids.
- predictions: The logits predicted by the model of shape (batch_size, sequence_length, num_classes).
- label_ids: The ground truth labels of shape (batch_size, sequence_length).
required
Returns:
Type
Description
Optional[Dict[str, float]]
Optional[Dict[str, float]]: A dictionary containing the BLEU score. Returns None if an exception occurs.
Args:
input (BatchInput): The batch input data.
output (OutputConfig): The output data.
state (State): The state manager.
**kwargs: Additional keyword arguments.
A bolt for fine-tuning Hugging Face models on translation tasks.
Args:
input (BatchInput): The batch input data.
output (OutputConfig): The output data.
state (State): The state manager.
**kwargs: Arbitrary keyword arguments for extended functionality.
genius MySQL rise batch --output_s3_bucket my_bucket --output_s3_folder s3/folder none fetch --args host=localhost port=3306 user=root password=root database=mydb query="SELECT * FROM table" page_size=100
genius SQLite rise batch --output_s3_bucket my_bucket --output_s3_folder s3/folder none fetch --args s3_bucket=my_s3_bucket s3_key=mydb.sqlite query="SELECT * FROM table" page_size=100
genius PostgreSQL rise batch --output_s3_bucket my_bucket --output_s3_folder s3/folder none fetch --args host=localhost port=5432 user=postgres password=postgres database=mydb query="SELECT * FROM table" page_size=100
genius SQLServer rise batch --output_s3_bucket my_bucket --output_s3_folder s3/folder none fetch --args server=localhost port=1433 user=myuser password=mypassword database=mydatabase query="SELECT * FROM mytable"
genius CockroachDB rise batch --output_s3_bucket my_bucket --output_s3_folder s3/folder none fetch --args host=localhost port=26257 user=root password=root database=mydb query="SELECT * FROM table" page_size=100
genius Cassandra rise batch --output_s3_bucket my_bucket --output_s3_folder s3/folder none fetch --args hosts=localhost keyspace=my_keyspace query="SELECT * FROM my_table" page_size=100
genius CouchbaseSpout rise batch --output_s3_bucket my_bucket --output_s3_folder s3/folder none fetch --args host=localhost username=admin password=password bucket_name=my_bucket query="SELECT * FROM my_bucket" page_size=100
genius TimescaleDB rise batch --output_s3_bucket my_bucket --output_s3_folder s3/folder none fetch --args host=localhost port=5432 user=postgres password=postgres database=mydb query="SELECT * FROM hypertable" page_size=100
genius TiDB rise batch --output_s3_bucket my_bucket --output_s3_folder s3/folder none fetch --args host=localhost port=4000 user=root password=root database=mydb query="SELECT * FROM table" page_size=100
genius GoogleCloudSQL rise batch --output_s3_bucket my_bucket --output_s3_folder s3/folder none fetch --args host=127.0.0.1 port=3306 user=root password=root database=mydb query="SELECT * FROM table" page_size=100
genius Sybase rise batch --output_s3_bucket my_bucket --output_s3_folder s3/folder none fetch --args host=localhost port=5000 user=sa password=secret database=mydb query="SELECT * FROM table" page_size=100
genius Athena rise batch --output_s3_bucket my_bucket --output_s3_folder s3/folder none fetch --args region_name=us-east-1 output_location=s3://mybucket/output query="SELECT * FROM mytable"
genius KairosDB rise batch --output_s3_bucket my_bucket --output_s3_folder s3/folder none fetch --args url=http://mykairosdbhost:8080/api/v1/datapoints query="SELECT * FROM mymetric"
genius NuoDB rise batch --output_s3_bucket my_bucket --output_s3_folder s3/folder none fetch --args url=http://mynuodbhost:8080/v1/statement query="SELECT * FROM mytable"
genius MemSQL rise batch --output_s3_bucket my_bucket --output_s3_folder s3/folder none fetch --args host=mymemsqlhost user=myuser password=<PASSWORD> database=mydatabase query="SELECT * FROM mytable"
genius Vertica rise batch --output_s3_bucket my_bucket --output_s3_folder s3/folder none fetch --args host=my_host port=5433 user=my_user password=my_password database=my_database query="SELECT * FROM my_table"
The ParsePdf class is designed to process PDF files and classify them as either text-based or image-based.
It takes an input folder containing PDF files as an argument and iterates through each file.
For each PDF, it samples a few pages to determine the type of content it primarily contains.
If the PDF is text-based, the class extracts the text from each page and saves it as a JSON file.
If the PDF is image-based, it converts each page to a PNG image and saves them in a designated output folder.
Args:
input (BatchInput): An instance of the BatchInput class for reading the data.
output (BatchOutput): An instance of the BatchOutput class for saving the data.
state (State): An instance of the State class for maintaining the state.
**kwargs: Additional keyword arguments.
📖 Process PDF files in the given input folder and classify them as text-based or image-based.
Parameters:
Name
Type
Description
Default
input_folder
str
The folder containing PDF files to process.
None
This method iterates through each PDF file in the specified folder, reads a sample of pages,
and determines whether the PDF is text-based or image-based. It then delegates further processing
to _process_text_pdf or _process_image_pdf based on this determination.
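The text-vs-image decision reduces to a heuristic over the sampled pages: if the extracted text is consistently near-empty, the file is treated as image-based. A stand-alone sketch of that rule (the character threshold and function name are assumptions; real page extraction would use a PDF library):

```python
from typing import List

def classify_pdf_pages(page_texts: List[str], min_chars: int = 25) -> str:
    """Return 'text' if most sampled pages carry real text, else 'image'."""
    if not page_texts:
        return "image"
    text_pages = sum(1 for text in page_texts if len(text.strip()) >= min_chars)
    return "text" if text_pages > len(page_texts) / 2 else "image"
```

The result then selects between the text-extraction path and the page-to-PNG path, mirroring the delegation to _process_text_pdf and _process_image_pdf.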
The ParseCBZCBR class is designed to process CBZ and CBR files, which are commonly used for comic books.
It takes an input folder containing CBZ/CBR files as an argument and iterates through each file.
For each file, it extracts the images and saves them in a designated output folder.
Parameters:
Name
Type
Description
Default
input
BatchInput
An instance of the BatchInput class for reading the data.
required
output
BatchOutput
An instance of the BatchOutput class for saving the data.
required
state
State
An instance of the State class for maintaining the state.
The ParseDjvu class is designed to process DJVU files and classify them as either text-based or image-based.
It takes an input folder containing DJVU files as an argument and iterates through each file.
For each DJVU, it samples a few pages to determine the type of content it primarily contains.
If the DJVU is text-based, the class extracts the text from each page and saves it as a JSON file.
If the DJVU is image-based, it converts each page to a PNG image and saves them in a designated output folder.
Parameters:
Name
Type
Description
Default
input
BatchInput
An instance of the BatchInput class for reading the data.
required
output
BatchOutput
An instance of the BatchOutput class for saving the data.
required
state
State
An instance of the State class for maintaining the state.
📖 Process DJVU files in the given input folder and classify them as text-based or image-based.
Parameters:
Name
Type
Description
Default
input_folder
str
The folder containing DJVU files to process.
None
This method iterates through each DJVU file in the specified folder, reads a sample of pages,
and determines whether the DJVU is text-based or image-based. It then delegates further processing
to _process_text_djvu or _process_image_djvu based on this determination.
The ParseEpub class is designed to process EPUB files and classify them as either text-based or image-based.
It takes an input folder containing EPUB files as an argument and iterates through each file.
For each EPUB, it samples a few items to determine the type of content it primarily contains.
If the EPUB is text-based, the class extracts the text from each item and saves it as a JSON file.
If the EPUB is image-based, it saves the images in a designated output folder.
Parameters:
Name
Type
Description
Default
input
BatchInput
An instance of the BatchInput class for reading the data.
required
output
BatchOutput
An instance of the BatchOutput class for saving the data.
required
state
State
An instance of the State class for maintaining the state.
📖 Process EPUB files in the given input folder and classify them as text-based or image-based.
Parameters:
Name
Type
Description
Default
input_folder
str
The folder containing EPUB files to process.
None
This method iterates through each EPUB file in the specified folder, reads a sample of items,
and determines whether the EPUB is text-based or image-based. It then delegates further processing
to _process_text_epub or _process_image_epub based on this determination.
The ParseMOBI class is designed to process MOBI files.
It takes an input folder containing MOBI files as an argument and iterates through each file.
For each file, it extracts the images and saves them in a designated output folder.
Parameters:
Name
Type
Description
Default
input
BatchInput
An instance of the BatchInput class for reading the data.
required
output
BatchOutput
An instance of the BatchOutput class for saving the data.
required
state
State
An instance of the State class for maintaining the state.
The ParsePostScript class is designed to process PostScript files and classify them as either text-based or image-based.
It takes an input folder containing PostScript files as an argument and iterates through each file.
For each PostScript file, it converts it to PDF and samples a few pages to determine the type of content it primarily contains.
If the PostScript is text-based, the class extracts the text from each page and saves it as a JSON file.
If the PostScript is image-based, it converts each page to a PNG image and saves them in a designated output folder.
Parameters:
Name
Type
Description
Default
input
BatchInput
An instance of the BatchInput class for reading the data.
required
output
BatchOutput
An instance of the BatchOutput class for saving the data.
required
state
State
An instance of the State class for maintaining the state.
📖 Process PostScript files in the given input folder and classify them as text-based or image-based.
Parameters:
Name
Type
Description
Default
input_folder
str
The folder containing PostScript files to process.
None
This method iterates through each PostScript file in the specified folder, converts it to PDF,
reads a sample of pages, and determines whether the PostScript is text-based or image-based.
It then delegates further processing to _process_text_ps or _process_image_ps based on this determination.
The ParseXPS class is designed to process XPS files.
It takes an input folder containing XPS files as an argument and iterates through each file.
For each file, it extracts the images and saves them in a designated output folder.
Parameters:
Name
Type
Description
Default
input
BatchInput
An instance of the BatchInput class for reading the data.
required
output
BatchOutput
An instance of the BatchOutput class for saving the data.
required
state
State
An instance of the State class for maintaining the state.
The ConvertImage class is designed to convert images from one format to another.
It takes an input folder containing images and an output format as arguments.
The class iterates through each image file in the specified folder and converts it to the desired format.
Additional options like quality and subsampling can be specified for lossy formats like 'JPG'.
Parameters:
Name
Type
Description
Default
input
BatchInput
An instance of the BatchInput class for reading the data.
required
output
BatchOutput
An instance of the BatchOutput class for saving the data.
required
state
State
An instance of the State class for maintaining the state.
📖 Convert images in the given input folder to the specified output format.
Parameters:
Name
Type
Description
Default
output_format
str
The format to convert images to ('PNG' or 'JPG').
required
quality
Optional[int]
The quality of the output image for lossy formats like 'JPG'. Defaults to None.
None
subsampling
Optional[int]
The subsampling factor for JPEG compression. Defaults to 0.
0
This method iterates through each image file in the specified folder, reads the image,
and converts it to the specified output format. Additional parameters like quality and subsampling
can be set for lossy formats.
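The conversion itself boils down to a load-and-save with format-specific options. A sketch using Pillow, with the save options mirroring the quality and subsampling parameters above; Pillow is assumed to be installed, and its import is kept inside the function so the path helper stays dependency-free:

```python
from pathlib import Path
from typing import Optional

def converted_path(src: str, output_format: str) -> Path:
    """Target filename with the new extension, e.g. photo.png -> photo.jpg."""
    return Path(src).with_suffix("." + output_format.lower())

def convert_image(src: str, output_format: str,
                  quality: Optional[int] = None, subsampling: int = 0) -> Path:
    from PIL import Image  # lazy import; Pillow assumed installed
    dst = converted_path(src, output_format)
    options = {}
    if output_format.upper() in ("JPG", "JPEG"):
        # Lossy options only make sense for JPEG output.
        options["subsampling"] = subsampling
        if quality is not None:
            options["quality"] = quality
    with Image.open(src) as img:
        img.convert("RGB").save(dst, **options)
    return dst
```

Converting to RGB before saving avoids Pillow errors when a PNG with an alpha channel is written out as JPEG.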
The ImageClassPredictor class classifies images using a pre-trained PyTorch model.
It assumes that the input.input_folder contains sub-folders of images to be classified.
The classified images are saved in output.output_folder, organized by their predicted labels.
📖 Classify images in the input sub-folders using a pre-trained PyTorch model.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| classes | str | JSON string mapping class indices to labels. | required |
| model_path | str | Path to the pre-trained PyTorch model. | required |
| use_cuda | bool | Whether to use CUDA for model inference. Default is False. | False |
This method iterates through each image file in the specified sub-folders, applies the model,
and classifies the image. The classified images are then saved in an output folder, organized by their predicted labels.
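The organize-by-label step can be sketched with the standard library; the `predict` callable here is a stand-in for the actual PyTorch inference, so this is an illustration of the folder layout rather than the real implementation:

```python
import os
import shutil

def organize_by_label(input_folder: str, output_folder: str, predict) -> None:
    """Copy each image into a sub-folder of output_folder named after its predicted label."""
    for root, _, files in os.walk(input_folder):
        for name in files:
            src = os.path.join(root, name)
            label = predict(src)  # stand-in for model inference
            label_dir = os.path.join(output_folder, label)
            os.makedirs(label_dir, exist_ok=True)
            shutil.copy2(src, label_dir)
```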
The TrainImageClassifier class trains an image classifier using a ResNet-152 model.
It assumes that the input.input_folder contains sub-folders named 'train' and 'test'.
Each of these sub-folders should contain class-specific folders with images.
The trained model is saved as 'model.pth' in output.output_folder.
📖 Train an image classifier using a ResNet-152 model.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| num_classes | int | Number of classes of the images. | 4 |
| epochs | int | Number of training epochs. Default is 10. | 10 |
| batch_size | int | Batch size for training. Default is 32. | 32 |
| learning_rate | float | Learning rate for the optimizer. Default is 0.001. | 0.001 |
| use_cuda | bool | Whether to use CUDA for model training. Default is False. | False |
This method trains a ResNet-152 model using the images in the 'train' and 'test' sub-folders
of input.input_folder. Each of these sub-folders should contain class-specific folders with images.
The trained model is saved as 'model.pth' in output.output_folder.
The TROCRImageOCR class performs OCR (Optical Character Recognition) on images using Microsoft's TROCR model.
It expects the input.input_folder to contain the images for OCR and saves the OCR results as JSON files in output.output_folder.
📖 Perform OCR on images in the input folder and save the OCR results as JSON files in the output folder.
This method iterates through each image file in input.input_folder, performs OCR using the TROCR model,
and saves the OCR results as JSON files in output.output_folder.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| kind | str | The kind of TROCR model to use. Default is "printed". Options are "printed" or "handwritten". | 'printed' |
| use_cuda | bool | Whether to use CUDA for model inference. Default is True. | True |
The TROCRImageOCRAPI class performs OCR (Optical Character Recognition) on images using Microsoft's TROCR model.
The class exposes an API endpoint for OCR on single images. The endpoint is accessible at /api/v1/ocr.
The API takes a POST request with a JSON payload containing a base64 encoded image under the key image_base64.
It returns a JSON response containing the OCR result under the key ocr_text.
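A client request matching that contract can be sketched with the standard library; the host and port in the commented-out call are placeholders, not documented values:

```python
import base64
import json

def build_ocr_payload(image_bytes: bytes) -> str:
    """Build the JSON body expected by the /api/v1/ocr endpoint."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return json.dumps({"image_base64": encoded})

# Sending it might look like this (placeholder URL):
# import requests
# resp = requests.post("http://localhost:3000/api/v1/ocr",
#                      data=build_ocr_payload(open("page.png", "rb").read()),
#                      headers={"Content-Type": "application/json"})
# print(resp.json()["ocr_text"])
```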
The FineTuneTROCR class is designed to fine-tune the TROCR model on a custom OCR dataset.
It supports three popular OCR dataset formats: COCO, ICDAR, and SynthText.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input | BatchInput | An instance of the BatchInput class for reading the data. | required |
| output | BatchOutput | An instance of the BatchOutput class for saving the data. | required |
| state | State | An instance of the State class for maintaining the state. | required |
| **kwargs | | Additional keyword arguments. | {} |
Dataset Formats
COCO: Assumes a folder structure with an 'annotations.json' file containing image and text annotations.
ICDAR: Assumes a folder structure with 'Images' and 'Annotations' folders containing image files and XML annotation files respectively.
SynthText: Assumes a folder with image files and corresponding '.txt' files containing ground truth text.
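As an illustration of the COCO-style layout described above, a minimal loader for image/text pairs might look like the sketch below; the field names follow common COCO captioning conventions and are an assumption about the exact schema:

```python
import json
import os

def load_coco_pairs(dataset_folder: str):
    """Yield (image_path, text) pairs from an annotations.json in COCO style."""
    with open(os.path.join(dataset_folder, "annotations.json")) as f:
        coco = json.load(f)
    # Map image ids to file names, then join annotations to their images
    images = {img["id"]: img["file_name"] for img in coco["images"]}
    for ann in coco["annotations"]:
        yield os.path.join(dataset_folder, images[ann["image_id"]]), ann["caption"]
```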
📖 Fine-tune the TROCR model on a custom OCR dataset.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| epochs | int | Number of training epochs. | required |
| batch_size | int | Batch size for training. | required |
| learning_rate | float | Learning rate for the optimizer. | required |
| dataset_format | str | Format of the OCR dataset. Supported formats are "coco", "icdar", and "synthtext". | required |
| use_cuda | bool | Whether to use CUDA for training. Default is False. | False |
This method fine-tunes the TROCR model using the images and annotations in the dataset specified by dataset_format.
The fine-tuned model is saved to the specified output path.
The Pix2StructImageOCR class performs OCR on images using Google's Pix2Struct model.
It expects the input.input_folder to contain the images for OCR and saves the OCR results as JSON files in output.output_folder.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input | BatchInput | Instance of BatchInput for reading data. | required |
| output | BatchOutput | Instance of BatchOutput for saving data. | required |
| state | State | Instance of State for maintaining state. | required |
| model_name | str | The name of the Pix2Struct model to use. Default is "google/pix2struct-large". | 'google/pix2struct-large' |
The Pix2StructImageOCRAPI class performs OCR on images using Google's Pix2Struct model.
The class exposes an API endpoint for OCR on single images. The endpoint is accessible at /api/v1/ocr.
The API takes a POST request with a JSON payload containing a base64 encoded image under the key image_base64.
It returns a JSON response containing the OCR result under the key ocr_text.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input | BatchInput | Instance of BatchInput for reading data. | required |
| output | BatchOutput | Instance of BatchOutput for saving data. | required |
| state | State | Instance of State for maintaining state. | required |
| model_name | str | The name of the Pix2Struct model to use. Default is "google/pix2struct-large". | 'google/pix2struct-large' |
The FineTunePix2Struct class is designed to fine-tune the Pix2Struct model on a custom OCR dataset.
It supports three popular OCR dataset formats: COCO, ICDAR, and SynthText.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input | BatchInput | An instance of the BatchInput class for reading the data. | required |
| output | BatchOutput | An instance of the BatchOutput class for saving the data. | required |
| state | State | An instance of the State class for maintaining the state. | required |
| model_name | str | The name of the Pix2Struct model to use. Default is "google/pix2struct-large". | 'google/pix2struct-large' |
| **kwargs | | Additional keyword arguments. | {} |
Dataset Formats
COCO: Assumes a folder structure with an 'annotations.json' file containing image and text annotations.
ICDAR: Assumes a folder structure with 'Images' and 'Annotations' folders containing image files and XML annotation files respectively.
SynthText: Assumes a folder with image files and corresponding '.txt' files containing ground truth text.
📖 Fine-tune the Pix2Struct model on a custom OCR dataset.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| epochs | int | Number of training epochs. | required |
| batch_size | int | Batch size for training. | required |
| learning_rate | float | Learning rate for the optimizer. | required |
| dataset_format | str | Format of the OCR dataset. Supported formats are "coco", "icdar", and "synthtext". | required |
| use_cuda | bool | Whether to use CUDA for training. Default is False. | False |
This method fine-tunes the Pix2Struct model using the images and annotations in the dataset specified by dataset_format.
The fine-tuned model is saved to the specified output path.
Additional keyword arguments for initializing the spout.
Keyword Arguments:
Batch input:
- input_folder (str): The input folder argument.
- input_s3_bucket (str): The input bucket argument.
- input_s3_folder (str): The input S3 folder argument.
Batch output:
- output_folder (str): The output folder argument.
- output_s3_bucket (str): The output bucket argument.
- output_s3_folder (str): The output S3 folder argument.
Streaming input:
- input_kafka_cluster_connection_string (str): The input Kafka servers argument.
- input_kafka_topic (str): The input kafka topic argument.
- input_kafka_consumer_group_id (str): The Kafka consumer group id.
Streaming output:
- output_kafka_cluster_connection_string (str): The output Kafka servers argument.
- output_kafka_topic (str): The output kafka topic argument.
Redis state manager config:
- redis_host (str): The host address for the Redis server.
- redis_port (int): The port number for the Redis server.
- redis_db (int): The Redis database to be used.
Postgres state manager config:
- postgres_host (str): The host address for the PostgreSQL server.
- postgres_port (int): The port number for the PostgreSQL server.
- postgres_user (str): The username for the PostgreSQL server.
- postgres_password (str): The password for the PostgreSQL server.
- postgres_database (str): The PostgreSQL database to be used.
- postgres_table (str): The PostgreSQL table to be used.
DynamoDB state manager config:
- dynamodb_table_name (str): The name of the DynamoDB table.
- dynamodb_region_name (str): The AWS region for DynamoDB.
Deployment
- k8s_kind (str): Kind of Kubernetes resource to deploy as; choices are "deployment", "service", "job", "cron_job".
- k8s_name (str): Name of the Kubernetes resource.
- k8s_image (str): Docker image for the Kubernetes resource.
- k8s_replicas (int): Number of replicas.
- k8s_env_vars (json): Environment variables as a JSON string.
- k8s_cpu (str): CPU requirements.
- k8s_memory (str): Memory requirements.
- k8s_storage (str): Storage requirements.
- k8s_gpu (str): GPU requirements.
- k8s_kube_config_path (str): Path to the local Kubernetes config file.
- k8s_api_key (str): API key for the Kubernetes cluster.
- k8s_api_host (str): API host of the Kubernetes cluster.
- k8s_verify_ssl (str): Whether to verify SSL for the Kubernetes API.
- k8s_ssl_ca_cert (str): Path to the SSL CA certificate for the Kubernetes API.
- k8s_cluster_name (str): Name of the Kubernetes cluster.
- k8s_context_name (str): Name of the kubeconfig context.
- k8s_namespace (str): Kubernetes namespace. Default is "default".
- k8s_labels (json): Labels for Kubernetes resources, as a JSON string.
- k8s_annotations (json): Annotations for Kubernetes resources, as a JSON string.
- k8s_port (int): Port to run the spout on as a service.
- k8s_target_port (int): Port to expose the spout on as a service.
- k8s_schedule (str): Schedule to run the spout on as a cron job.
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| state_type | str | The type of state manager ("none", "redis", "postgres", or "dynamodb"). | required |
| **kwargs | | Additional keyword arguments for initializing the spout. | {} |
Keyword Arguments:
Batch output:
- output_folder (str): The directory where output files should be stored temporarily.
- output_s3_bucket (str): The name of the S3 bucket for output storage.
- output_s3_folder (str): The S3 folder for output storage.
Streaming output:
- output_kafka_topic (str): Kafka output topic for streaming spouts.
- output_kafka_cluster_connection_string (str): Kafka connection string for streaming spouts.
Stream to Batch output:
- output_folder (str): The directory where output files should be stored temporarily.
- output_s3_bucket (str): The name of the S3 bucket for output storage.
- output_s3_folder (str): The S3 folder for output storage.
- buffer_size (int): Number of messages to buffer.
Redis state manager config:
- redis_host (str): The host address for the Redis server.
- redis_port (int): The port number for the Redis server.
- redis_db (int): The Redis database to be used.
Postgres state manager config:
- postgres_host (str): The host address for the PostgreSQL server.
- postgres_port (int): The port number for the PostgreSQL server.
- postgres_user (str): The username for the PostgreSQL server.
- postgres_password (str): The password for the PostgreSQL server.
- postgres_database (str): The PostgreSQL database to be used.
- postgres_table (str): The PostgreSQL table to be used.
DynamoDB state manager config:
- dynamodb_table_name (str): The name of the DynamoDB table.
- dynamodb_region_name (str): The AWS region for DynamoDB.
Additional keyword arguments for initializing the spout.
Keyword Arguments:
Batch output:
- output_folder (str): The directory where output files should be stored temporarily.
- output_s3_bucket (str): The name of the S3 bucket for output storage.
- output_s3_folder (str): The S3 folder for output storage.
Streaming output:
- output_kafka_topic (str): Kafka output topic for streaming spouts.
- output_kafka_cluster_connection_string (str): Kafka connection string for streaming spouts.
Stream to Batch output:
- output_folder (str): The directory where output files should be stored temporarily.
- output_s3_bucket (str): The name of the S3 bucket for output storage.
- output_s3_folder (str): The S3 folder for output storage.
- buffer_size (int): Number of messages to buffer.
Redis state manager config:
- redis_host (str): The host address for the Redis server.
- redis_port (int): The port number for the Redis server.
- redis_db (int): The Redis database to be used.
Postgres state manager config:
- postgres_host (str): The host address for the PostgreSQL server.
- postgres_port (int): The port number for the PostgreSQL server.
- postgres_user (str): The username for the PostgreSQL server.
- postgres_password (str): The password for the PostgreSQL server.
- postgres_database (str): The PostgreSQL database to be used.
- postgres_table (str): The PostgreSQL table to be used.
DynamoDB state manager config:
- dynamodb_table_name (str): The name of the DynamoDB table.
- dynamodb_region_name (str): The AWS region for DynamoDB.
Deployment
- k8s_kind (str): Kind of Kubernetes resource to deploy as; choices are "deployment", "service", "job", "cron_job".
- k8s_name (str): Name of the Kubernetes resource.
- k8s_image (str): Docker image for the Kubernetes resource.
- k8s_replicas (int): Number of replicas.
- k8s_env_vars (json): Environment variables as a JSON string.
- k8s_cpu (str): CPU requirements.
- k8s_memory (str): Memory requirements.
- k8s_storage (str): Storage requirements.
- k8s_gpu (str): GPU requirements.
- k8s_kube_config_path (str): Path to the local Kubernetes config file.
- k8s_api_key (str): API key for the Kubernetes cluster.
- k8s_api_host (str): API host of the Kubernetes cluster.
- k8s_verify_ssl (str): Whether to verify SSL for the Kubernetes API.
- k8s_ssl_ca_cert (str): Path to the SSL CA certificate for the Kubernetes API.
- k8s_cluster_name (str): Name of the Kubernetes cluster.
- k8s_context_name (str): Name of the kubeconfig context.
- k8s_namespace (str): Kubernetes namespace. Default is "default".
- k8s_labels (json): Labels for Kubernetes resources, as a JSON string.
- k8s_annotations (json): Annotations for Kubernetes resources, as a JSON string.
- k8s_port (int): Port to run the spout on as a service.
- k8s_target_port (int): Port to expose the spout on as a service.
- k8s_schedule (str): Schedule to run the spout on as a cron job.
Command-line interface for managing spouts and bolts based on a YAML configuration.
The YamlCtl class provides methods to run specific or all spouts and bolts defined in a YAML file.
The YAML file's structure is defined by the Geniusfile schema.
Run the command-line interface for managing spouts and bolts based on provided arguments.
Please note that there is no ordering of the spouts and bolts in the YAML configuration.
Each spout and bolt is an independent entity even when connected together.
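A minimal Geniusfile might look like the sketch below. The field names and nesting here are illustrative assumptions based on the concepts described above (spouts, bolts, input/output types, state managers); consult the Geniusfile schema for the authoritative structure.

```yaml
version: "1"

spouts:
  my_spout:                      # hypothetical spout name
    name: MySpout                # class implementing the spout
    method: fetch                # method to invoke
    output:
      type: streaming
      args:
        output_topic: raw_events
        kafka_servers: localhost:9094
    state:
      type: none

bolts:
  my_bolt:                       # hypothetical bolt name
    name: MyBolt
    method: process
    input:
      type: streaming
      args:
        input_topic: raw_events
        kafka_servers: localhost:9094
    output:
      type: batch
      args:
        output_folder: /tmp/out
    state:
      type: none
```

Because spouts and bolts are unordered and independent, the connection between them is expressed only through shared resources such as the Kafka topic above.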
The Bolt class is a base class for all bolts in the given context.
It inherits from the Task class and provides methods for executing tasks
both locally and remotely, as well as managing their state, with state management
options including in-memory, Redis, PostgreSQL, and DynamoDB,
and input and output data for batch, streaming, stream-to-batch, and batch-to-streaming.
The Bolt class uses the Input, Output and State classes, which are abstract base
classes for managing input data, output data and states, respectively. The Input and
Output classes each have two subclasses: StreamingInput, BatchInput, StreamingOutput
and BatchOutput, which manage streaming and batch input and output data, respectively.
The State class is used to get and set state, and it has several subclasses for different types of state managers.
The Bolt class also uses the ECSManager and K8sManager classes in the execute_remote method,
which are used to manage tasks on Amazon ECS and Kubernetes, respectively.
Usage
Create an instance of the Bolt class by providing an Input object, an Output object and a State object.
The Input object specifies the input data for the bolt.
The Output object specifies the output data for the bolt.
The State object handles the management of the bolt's state.
This static method is used to create a bolt of a specific type. It takes in an input type,
an output type, a state type, and additional keyword arguments for initializing the bolt.
The method creates the input, output, and state manager based on the provided types,
and then creates and returns a bolt using these configurations.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| klass | type | The Bolt class to create. | required |
| input_type | str | The type of input ("batch" or "streaming"). | required |
| output_type | str | The type of output ("batch" or "streaming"). | required |
| state_type | str | The type of state manager ("none", "redis", "postgres", or "dynamodb"). | required |
| **kwargs | | Additional keyword arguments for initializing the bolt. | {} |
Keyword Arguments:
Batch input:
- input_folder (str): The input folder argument.
- input_s3_bucket (str): The input bucket argument.
- input_s3_folder (str): The input S3 folder argument.
Batch output config:
- output_folder (str): The output folder argument.
- output_s3_bucket (str): The output bucket argument.
- output_s3_folder (str): The output S3 folder argument.
Streaming input:
- input_kafka_cluster_connection_string (str): The input Kafka servers argument.
- input_kafka_topic (str): The input kafka topic argument.
- input_kafka_consumer_group_id (str): The Kafka consumer group id.
Streaming output:
- output_kafka_cluster_connection_string (str): The output Kafka servers argument.
- output_kafka_topic (str): The output kafka topic argument.
Stream-to-Batch input:
- buffer_size (int): Number of messages to buffer.
- input_kafka_cluster_connection_string (str): The input Kafka servers argument.
- input_kafka_topic (str): The input kafka topic argument.
- input_kafka_consumer_group_id (str): The Kafka consumer group id.
Batch-to-Streaming input:
- buffer_size (int): Number of messages to buffer.
- input_folder (str): The input folder argument.
- input_s3_bucket (str): The input bucket argument.
- input_s3_folder (str): The input S3 folder argument.
Stream-to-Batch output:
- buffer_size (int): Number of messages to buffer.
- output_folder (str): The output folder argument.
- output_s3_bucket (str): The output bucket argument.
- output_s3_folder (str): The output S3 folder argument.
Redis state manager config:
- redis_host (str): The Redis host argument.
- redis_port (str): The Redis port argument.
- redis_db (str): The Redis database argument.
Postgres state manager config:
- postgres_host (str): The PostgreSQL host argument.
- postgres_port (str): The PostgreSQL port argument.
- postgres_user (str): The PostgreSQL user argument.
- postgres_password (str): The PostgreSQL password argument.
- postgres_database (str): The PostgreSQL database argument.
- postgres_table (str): The PostgreSQL table argument.
DynamoDB state manager config:
- dynamodb_table_name (str): The DynamoDB table name argument.
- dynamodb_region_name (str): The DynamoDB region name argument.
Returns:

| Name | Type | Description |
| --- | --- | --- |
| Bolt | Bolt | The created bolt. |
Raises:

| Type | Description |
| --- | --- |
| ValueError | If an invalid input type, output type, or state type is provided. |
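The create-by-type dispatch described above can be sketched in miniature. This is a simplified stand-in using placeholder classes, not the actual Geniusrise implementation, but it shows the pattern: map type strings to classes and raise ValueError on anything unrecognized.

```python
class BatchInput:
    def __init__(self, **kwargs):
        self.config = kwargs

class StreamingInput:
    def __init__(self, **kwargs):
        self.config = kwargs

# Dispatch table from type string to input class
INPUT_TYPES = {"batch": BatchInput, "streaming": StreamingInput}

def make_input(input_type: str, **kwargs):
    """Map a type string to its input class, mirroring the factory dispatch."""
    try:
        cls = INPUT_TYPES[input_type]
    except KeyError:
        raise ValueError(f"Invalid input type: {input_type!r}")
    return cls(**kwargs)
```

The same table-lookup-plus-ValueError shape extends naturally to output types and state managers.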
The Spout class is a base class for all spouts in the given context.
It inherits from the Task class and provides methods for executing tasks
both locally and remotely, as well as managing their state, with state management
options including in-memory, Redis, PostgreSQL, and DynamoDB,
and output data for batch or streaming data.
The Spout class uses the Output and State classes, which are abstract base
classes for managing output data and states, respectively. The Output class
has two subclasses: StreamingOutput and BatchOutput, which manage streaming and
batch output data, respectively. The State class is used to get and set state,
and it has several subclasses for different types of state managers.
The Spout class also uses the ECSManager and K8sManager classes in the execute_remote method,
which are used to manage tasks on Amazon ECS and Kubernetes, respectively.
Usage
Create an instance of the Spout class by providing an Output object and a State object.
The Output object specifies the output data for the spout.
The State object handles the management of the spout's state.
Example
output = Output(...)
state = State(...)
spout = Spout(output, state)
| Name | Type | Description | Default |
| --- | --- | --- | --- |
| state_type | str | The type of state manager ("none", "redis", "postgres", or "dynamodb"). | required |
| **kwargs | | Additional keyword arguments for initializing the spout. | {} |
Keyword Arguments:
Batch output:
- output_folder (str): The directory where output files should be stored temporarily.
- output_s3_bucket (str): The name of the S3 bucket for output storage.
- output_s3_folder (str): The S3 folder for output storage.
Streaming output:
- output_kafka_topic (str): Kafka output topic for streaming spouts.
- output_kafka_cluster_connection_string (str): Kafka connection string for streaming spouts.
Stream to Batch output:
- output_folder (str): The directory where output files should be stored temporarily.
- output_s3_bucket (str): The name of the S3 bucket for output storage.
- output_s3_folder (str): The S3 folder for output storage.
- buffer_size (int): Number of messages to buffer.
Redis state manager config:
- redis_host (str): The host address for the Redis server.
- redis_port (int): The port number for the Redis server.
- redis_db (int): The Redis database to be used.
Postgres state manager config:
- postgres_host (str): The host address for the PostgreSQL server.
- postgres_port (int): The port number for the PostgreSQL server.
- postgres_user (str): The username for the PostgreSQL server.
- postgres_password (str): The password for the PostgreSQL server.
- postgres_database (str): The PostgreSQL database to be used.
- postgres_table (str): The PostgreSQL table to be used.
DynamoDB state manager config:
- dynamodb_table_name (str): The name of the DynamoDB table.
- dynamodb_region_name (str): The AWS region for DynamoDB.
Returns:

| Name | Type | Description |
| --- | --- | --- |
| Spout | Spout | The created spout. |
Raises:

| Type | Description |
| --- | --- |
| ValueError | If an invalid output type or state type is provided. |
Consume messages from a Kafka topic and save them as JSON files in the input folder.
Stops consuming after reaching the latest message or the specified number of messages.
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input_topic | str | Kafka topic to consume data from. | required |
| kafka_cluster_connection_string | str | Connection string for the Kafka cluster. | required |
| nr_messages | int | Number of messages to consume. Defaults to 1000. | 1000 |
| group_id | str | Kafka consumer group ID. Defaults to "geniusrise". | 'geniusrise' |
| partition_scheme | Optional[str] | Optional partitioning scheme for Kafka, e.g., "year/month/day". | None |
Returns:

| Name | Type | Description |
| --- | --- | --- |
| str | str | The path to the folder where the consumed messages are saved as JSON files. |
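The save-as-JSON loop can be sketched with the standard library. The message iterable below is a stand-in for the Kafka consumer, so this illustrates only the buffering and file-writing behavior, not the Kafka wiring:

```python
import json
import os

def save_messages(messages, input_folder: str, nr_messages: int = 1000) -> str:
    """Write up to nr_messages messages as numbered JSON files and return the folder."""
    os.makedirs(input_folder, exist_ok=True)
    for i, message in enumerate(messages):
        if i >= nr_messages:
            break  # stop after the requested number of messages
        with open(os.path.join(input_folder, f"message_{i}.json"), "w") as f:
            json.dump(message, f)
    return input_folder
```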
Raises:

| Type | Description |
| --- | --- |
| FileNotExistError | If the output folder does not exist. |
Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| output_folder | str | Folder to save output files. | required |
| bucket | str | S3 bucket name. | required |
| s3_folder | str | Folder within the S3 bucket. | required |
| partition_scheme | Optional[str] | Partitioning scheme for S3, e.g., "year/month/day". | None |
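The "year/month/day" partitioning can be expressed with strftime format codes. The helper below is an illustrative sketch of how a partition scheme prefixes an S3 key, not the actual BatchOutput logic:

```python
from datetime import datetime

def partitioned_s3_key(s3_folder: str, filename: str,
                       partition_scheme: str = "%Y/%m/%d",
                       now: datetime = None) -> str:
    """Prefix an S3 key with a time-based partition path."""
    now = now or datetime.utcnow()
    return f"{s3_folder}/{now.strftime(partition_scheme)}/{filename}"
```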
Usage
# Initialize the BatchOutput instance
config = BatchOutput("/path/to/output", "my_bucket", "s3/folder", partition_scheme="%Y/%m/%d")

# Save data to a file
config.save({"key": "value"}, "example.json")

# Compose multiple BatchOutput instances
result = config1.compose(config2, config3)

# Convert output to a Spark DataFrame
spark_df = config.to_spark(spark_session)

# Copy files to a remote S3 bucket
config.to_s3()

# Flush the output to S3
config.flush()

# Collect metrics
metrics = config.collect_metrics()
Using from_streamz method to process streamz DataFrame
input = StreamingInput("my_topic", "localhost:9094")
streamz_df = ...  # Assume this is a streamz DataFrame
for row in input.from_streamz(streamz_df):
    print(row)
Using from_spark method to process Spark DataFrame
input = StreamingInput("my_topic", "localhost:9094")
spark_df = ...  # Assume this is a Spark DataFrame
map_func = lambda row: {"key": row.key, "value": row.value}
query_or_rdd = input.from_spark(spark_df, map_func)
Using compose method to merge multiple StreamingInput instances
This class provides a foundation for creating and managing tasks. Each task has a unique identifier and can be associated with specific input and output data.
🖨️ Pretty print the fetch_* methods and their parameters along with their default values and docstrings.
Also prints the class's docstring and init parameters.